Summary

This is a common garden experiment in which individual fish (FISH_ID) were exposed to four different TEMPERATURE treatments (i.e., 27\(^\circ\), 28.5\(^\circ\), 30\(^\circ\), and 31.5\(^\circ\)). Fish sampled in this experiment were collected from six different POPULATIONS spanning two REGIONS: a low- and a high-latitude region (i.e., three populations from each region).

Fish were first sampled at 27\(^\circ\) and then subsequently at each higher temperature. Once all fish had been tested at a given treatment temperature, the treatment temperature was increased by 1.5\(^\circ\) over three days. Fish were then given an additional five days to adjust to the temperature increase.

To determine the MAX metabolic rate, fish were placed in a swim tunnel for ten minutes. The first five minutes were used to slowly increase the water flow within the swim tunnel until fish reached the point where they alternated between pectoral swimming and lateral body undulations (i.e., gait change). The water flow within the swim tunnel was maintained at gait-change speed for an additional five minutes. Afterwards, fish were immediately placed within a designated respirometry CHAMBER, with air saturation rates being monitored for 3.5-4 hours on a four-minute measure, three-minute flush, and five-second wait cycle.

Maximum metabolic rate was determined by extracting the steepest sixty-second interval slope from the first (sometimes, though rarely, the second or third) measurement period. RESTING metabolic rate was determined by extracting the ten shallowest slopes recorded over the length of the experiment.
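
The steepest-slope extraction described above can be sketched as a rolling linear regression over the O2 trace. This is only an illustration: the function name, data frame, column names, and 1 Hz sampling rate are hypothetical, not taken from the actual recordings (which were processed with respR).

```r
# Sketch: find the steepest 60-s slope in an O2 trace via rolling regression.
# Assumes 1 Hz sampling, so a window of 60 rows spans 60 seconds.
steepest_slope <- function(time_s, o2, window = 60) {
  n <- length(o2)
  slopes <- vapply(seq_len(n - window + 1), function(i) {
    idx <- i:(i + window - 1)
    unname(coef(lm(o2[idx] ~ time_s[idx]))[2])  # slope over this window
  }, numeric(1))
  min(slopes)  # steepest decline = most negative slope
}

# Hypothetical 5-minute trace with a constant decline
trace <- data.frame(time_s = 1:300, o2 = 100 - 0.05 * (1:300))
steepest_slope(trace$time_s, trace$o2)
```

The same rolling-slope idea, with `min` swapped for the shallowest (least negative) slopes, would cover the resting-rate extraction.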

Fish were sampled in random order. The order that fish were sampled in determined the SUMP/SYSTEM fish run on as well as the chamber they were placed in (i.e., the first fish sampled went into chamber 1 on sump/system 1, the second fish into chamber 1 on sump/system 2, the third fish into chamber 2 on sump/system 1 etc.).

Immunocompetence was tested via phytohaemagglutinin (PHA) swelling assays at the same four experimental temperatures at which metabolic performance was tested. To perform the assay, fish were injected subcutaneously in the caudal peduncle with 0.03 mL of PHA. The thickness of the injection site was measured pre-injection as well as 18-24 hours post-injection. PHA produces a localized, cell-mediated response (e.g., inflammation, T-cell proliferation, etc.). The change in thickness between measurement periods was used as a proxy for immunocompetence.
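
As a minimal illustration of the proxy, the swelling response is simply post-injection minus pre-injection thickness; the data frame and column names below are made up for the example.

```r
# Hypothetical caudal-peduncle thickness measurements (mm)
pha <- data.frame(pre_mm  = c(1.10, 1.05, 1.22),
                  post_mm = c(1.45, 1.31, 1.60))

# Change in thickness = immunocompetence proxy
pha$swelling_mm <- pha$post_mm - pha$pre_mm
pha$swelling_mm
```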

Two weeks after live-animal testing concluded, blood and tissue samples were collected from each fish. White muscle tissue samples were used to assess the activity levels of two different enzymes: lactate dehydrogenase (LDH; anaerobic) and citrate synthase (CS; aerobic). Blood samples were used to determine haematocrit ratios.

[Figure: map of Acanthochromis polyacanthus collection sites]

Glossary of respirometry terms

EXP_FISH_ID Combined FISH_ID with TEMPERATURE the fish was tested at
FISH_ID Unique alphanumeric code provided to fish
POPULATION Population/Reef the fish was collected from
REGION Region (i.e. core or leading) fish was collected from
TEMPERATURE Temperature fish was tested at
MASS Mass of the fish
RESTING_DATE Date the resting metabolic rate was recorded for each fish
RESTING_CHAMBER Respirometry chamber the fish was tested in for RMR
RESTING_SYSTEM Respirometry system (i.e. dell or asus) fish was tested with
RESTING_SUMP Respirometry sump (i.e., 1 or 2) fish was tested in
RESTING_AM_PM Time of day (i.e. morning or afternoon) fish was tested
RESTING_START_TIME Time that the fish was placed inside the respirometry chamber
RESTING Resting metabolic rate (RMR)
RESTING_MgO2.hr Resting metabolic rate divided by mass
MAX_DATE Date that maximum metabolic rate was recorded
MAX_CHAMBER Respirometry chamber the fish was tested in for MMR
MAX_SYSTEM Respirometry system fish was tested with for MMR
MAX_SUMP Respirometry sump (i.e., 1 or 2) fish was tested in for MMR
MAX_AM_PM Time of day (i.e. morning or afternoon) fish was tested for MMR
MAX_START_TIME Time that the fish was placed inside the chamber for MMR
MAX Maximum metabolic rate (MMR)
MAX_MgO2.hr Maximum metabolic rate divided by mass
FAS Factorial aerobic scope (MMR/RMR)
NAS Net aerobic scope (MMR - RMR)
MgO2.hr_Net Net aerobic scope divided by mass
Swim.performance Fish swim performance in swim tunnel (i.e., good, okay, poor)
Notes Additional experimental notes
MASS_CENTERED Mass of fish (centered)

Load packages

Let's start by loading the packages that are needed

library(tidyverse) # data manipulation
library(janitor) # data manipulation
library(plyr) # data manipulation
library(dplyr) # data manipulation
library(lubridate) # data manipulation - specifically time data
library(chron) # data manipulation - specifically time data
library(glmmTMB) # running models
library(performance) # model validation
library(DHARMa) # model validation
library(MuMIn) # model validation
library(modelr) # model validation
library(car) # used for Anova function
library(emmeans) # post-hoc analysis
library(kableExtra) # creating tables
library(vtable) # creating tables
library(ggplot2) # plotting figures
library(ggeffects) # plotting models/model validation
library(sjPlot) # plotting models 
library(ggpubr) # plotting figures
library(broom) # dependent

Results

Metabolic rates

Resting oxygen consumption

Scenario

For details on the experiment performed please read the information at the top of this document. In brief, Acanthochromis polyacanthus from two different regions on the Great Barrier Reef (GBR) were tested for metabolic performance at four different temperatures: 27\(^\circ\)C, 28.5\(^\circ\)C, 30\(^\circ\)C, and 31.5\(^\circ\)C. Fish used in this study were collected from two different regions, low-latitude (i.e., Cairns) and high-latitude (i.e., Mackay); within each region fish were collected from three different populations. At each temperature, resting and maximum oxygen consumption were measured for every individual. Absolute aerobic scope was calculated using the following formula:

Absolute aerobic scope = (maximum oxygen consumption - resting oxygen consumption)
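
In code, the scope metrics reduce to simple arithmetic on paired MMR and RMR values; the numbers below are invented for illustration.

```r
# Hypothetical paired rates (MgO2 hr^-1) for three fish
mmr <- c(12.4, 15.1, 10.8)  # maximum oxygen consumption
rmr <- c( 4.1,  5.0,  3.6)  # resting oxygen consumption

aas <- mmr - rmr  # absolute (net) aerobic scope; NAS in the glossary
fas <- mmr / rmr  # factorial aerobic scope; FAS in the glossary
aas
```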

Individuals were first tested at 27\(^\circ\)C. Water temperature was then increased at a rate of 0.5\(^\circ\)C Day^-1 until the next temperature was reached. Fish were then provided with an additional 5 days to adjust to the new temperature before aerobic physiology was tested again.

Three traits are included within the aerobic physiology analysis: resting oxygen consumption, maximum oxygen consumption, and absolute aerobic scope. Data for each metric were collected from respirometry experiments recorded via a combination of programs, including AquaResp and PyroScience. Slopes (i.e., resting and maximum oxygen consumption values) were then calculated via the respR [https://januarharianto.github.io/respR/articles/respR.html] package.

Read in the data

Before beginning, always make sure that you are working in the correct directory.

knitr::opts_knit$set(root.dir=working.dir)

Now we can import the data. Replace import.data with the PATH to your data file; here the PATH has been stored in the object import.data (i.e. import.data = “PATH TO MY FILE”).

Load data

resp <- import.data

Data manipulation

Before the data can be analysed it is important to clean up the data file. Below, a number of adjustments are made: primarily making sure that columns are treated appropriately as factors, numeric, or time, as well as renaming some columns. Once these changes are made the data are saved into a new dataframe called resp2.

resp2 = resp %>% 
  dplyr::rename(EXP_FISH_ID = FISH_ID) %>%
  separate(EXP_FISH_ID, c("FISH_ID"), remove = FALSE) %>%
  mutate(FISH_ID = factor(FISH_ID), 
         POPULATION = factor(POPULATION), 
         REGION = factor(REGION), 
         TEMPERATURE = as.numeric(TEMPERATURE), #run with temperature as a factor
         RESTING_DATE = factor(RESTING_DATE), 
         RESTING_CHAMBER = factor(RESTING_CHAMBER), 
         RESTING_SYSTEM = factor(RESTING_SYSTEM), 
         RESTING_SUMP = factor(RESTING_SUMP), 
         RESTING_AM_PM = factor(RESTING_AM_PM), 
         RESTING_START_TIME = hms(RESTING_START_TIME),
         RESTING_END_TIME = hms(RESTING_ENDTIME),
         MAX_DATE = factor(MAX_DATE), 
         MAX_CHAMBER = factor(MAX_CHAMBER), 
         MAX_SYSTEM = factor(MAX_SYSTEM), 
         MAX_SUMP = factor(MAX_SUMP), 
         MAX_AM_PM = factor(MAX_AM_PM), 
         MAX_START_TIME = hms(MAX_START_TIME), 
         Swim.performance = factor(Swim.performance), 
         NAS = as.numeric(NAS), 
         FAS = as.numeric(FAS), 
         MgO2.hr_Net = as.numeric(MgO2.hr_Net), 
         RESTING_RUNTIME_SECONDS = as.numeric(hms(RESTING_RUNTIME))) %>% 
  dplyr::rename(MASS = WEIGHT) %>% 
  mutate(MASS_CENTERED = scale(MASS, scale = FALSE, center = TRUE))

Next, select data points will be removed. Beside each removed data point I have provided the reasoning for its exclusion, such as the fish died during the experiment, or the data quality was poor - which likely indicates that there was an issue with the equipment during the trial.

resp3 <- resp2 %>% 
  subset(  
    EXP_FISH_ID !="LCHA127_27" & # deceased during experiment
      EXP_FISH_ID !="LCHA132_27" & # deceased during experiment
      EXP_FISH_ID !="LKES168_27" # poor data quality
  ) 

Great! That is everything for data manipulation

Exploratory data analysis

Mass v Rest
ggplot(resp3, aes(MASS, RESTING_MgO2.hr_RESPR)) + 
  geom_point() + 
  geom_smooth(method = "lm") + 
  theme_classic()

Mass v REST (LATITUDE)
ggplot(resp3, aes(MASS, RESTING_MgO2.hr_RESPR, color = REGION)) + 
  geom_point() +
  theme_classic() + 
  geom_smooth(method = "lm")

TEMPERATURE v REST (LATITUDE)
ggplot(resp3, aes(TEMPERATURE, RESTING_MgO2.hr_RESPR, color = REGION)) + 
  geom_point() +
  theme_classic()

Fit the model

The model was fit using the glm function and later the glmmTMB package in R. A number of different models were tested to determine which hypothesis and associated variables best predicted resting oxygen consumption. Model fit was examined using AICc, BIC, and r-squared values. Additionally, models were examined via the validation diagnostics provided by the performance and DHARMa packages in R.

The first set of models tested two different hypotheses: 1) that mass has a major impact on the resting oxygen consumption of fish (this has been documented in the literature), and 2) that variables related to the time at which experiments were run have an important impact on the resting oxygen consumption of fish.

Fixed factors (linear regression models)
model 1
#--- base model ---#
rmr.1 <- glm(RESTING_MgO2.hr_RESPR ~ 1+ REGION * TEMPERATURE + MASS_CENTERED, 
                 family=gaussian(),
                 data = resp3) 

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.9071699 1.8396670 0.4931164 0.6225248
REGIONLeading -1.0631773 2.7553334 -0.3858616 0.7000498
TEMPERATURE 0.1759916 0.0632351 2.7831307 0.0059520
MASS_CENTERED 133.8254357 9.9843605 13.4035060 0.0000000
REGIONLeading:TEMPERATURE 0.0376457 0.0946023 0.3979365 0.6911434
model 2
#--- experimental rmr equipment hypothesis ---#
rmr.2 <- glm(RESTING_MgO2.hr_RESPR ~ 1+ REGION * TEMPERATURE + RESTING_SUMP + 
                   RESTING_AM_PM + RESTING_RUNTIME_SECONDS, 
                 family=gaussian(),
                 data = resp3) 

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0417266 2.5593818 0.0163034 0.9870104
REGIONLeading -1.8177666 3.7681132 -0.4824077 0.6301025
TEMPERATURE 0.3212288 0.0949708 3.3823948 0.0008816
RESTING_SUMP2 -0.0016294 0.2215833 -0.0073533 0.9941411
RESTING_AM_PMPM -0.4196626 0.2161103 -1.9418904 0.0537114
RESTING_RUNTIME_SECONDS -0.0001616 0.0000513 -3.1477047 0.0019264
REGIONLeading:TEMPERATURE 0.0125324 0.1293602 0.0968801 0.9229294
model comparison table
model df AICc BIC r2
rmr.1 6 561.9094 580.8294 0.6058512
rmr.2 8 680.4894 705.5293 0.2768734
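
A comparison table like the one above can be assembled along these lines. The helper names (aicc, r2_dev, compare_models) are invented for this sketch, and the deviance-based r-squared and small-sample AIC correction are stand-ins for whatever the original table used; only base R is required.

```r
# Small-sample corrected AIC: AICc = AIC + 2k(k + 1) / (n - k - 1)
aicc <- function(m) {
  k <- attr(logLik(m), "df")
  n <- nobs(m)
  AIC(m) + 2 * k * (k + 1) / (n - k - 1)
}

# Deviance-based r-squared for a gaussian glm
r2_dev <- function(m) 1 - m$deviance / m$null.deviance

# Build a comparison table from named glm fits
compare_models <- function(...) {
  mods <- list(...)
  data.frame(
    model = names(mods),
    df    = vapply(mods, function(m) attr(logLik(m), "df"), numeric(1)),
    AICc  = vapply(mods, aicc, numeric(1)),
    BIC   = vapply(mods, BIC, numeric(1)),
    r2    = vapply(mods, r2_dev, numeric(1)),
    row.names = NULL
  )
}

# e.g. compare_models(rmr.1 = rmr.1, rmr.2 = rmr.2)
```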

The model that contains MASS_CENTERED seems to do better than the model that incorporates variables associated with the time that experiments were performed. This is demonstrated by the lower AICc and BIC scores, as well as the higher r-squared value. However, RESTING_RUNTIME_SECONDS was a significant variable in model 2. Let’s see what a third model looks like if we include both MASS_CENTERED and RESTING_RUNTIME_SECONDS.

model 3
rmr.3 <- glm(RESTING_MgO2.hr_RESPR ~ 1+ REGION * TEMPERATURE + MASS_CENTERED + RESTING_RUNTIME_SECONDS, 
                 family=gaussian(),
                 data = resp3)
model comparison table 2
model df AICc BIC r2
rmr.1 6 561.9094 580.8294 0.6058512
rmr.2 8 680.4894 705.5293 0.2768734
rmr.3 7 544.9015 566.8936 0.6426906

It looks like the third model is better than the previous two. Next we will test whether the variable TEMPERATURE performs best as a 1^st (linear), 2^nd (quadratic), or 3^rd (cubic) order polynomial, as the relationship between temperature and resting oxygen consumption is predicted to be non-linear.

Polynomials
polynomial models

Note that the linear model has already been created via model rmr.3 in the previous section.

rmr.3.p2 <- glm(RESTING_MgO2.hr_RESPR ~ 1+ REGION * poly(TEMPERATURE, 2) + MASS_CENTERED + RESTING_RUNTIME_SECONDS, 
                 family=gaussian(),
                 data = resp3)  

rmr.3.p3 <- glm(RESTING_MgO2.hr_RESPR ~ 1+ REGION * poly(TEMPERATURE, 3) + MASS_CENTERED + RESTING_RUNTIME_SECONDS, 
                 family=gaussian(),
                 data = resp3)

polynomial model comparisons

model df AICc BIC r2
rmr.3 7 544.9015 566.8936 0.6489236
rmr.3.p2 9 545.1143 573.1773 0.6566813
rmr.3.p3 11 548.8463 582.8799 0.6580731

From our model comparison we can see that there is no additional benefit to including temperature as a 2^nd or 3^rd order polynomial. However, the linear and quadratic models both perform well.

Random factors

Fish were repeatedly sampled over four different temperatures, therefore repeated sampling needs to be accounted for. To do this, random factors will be included within the model. There are a number of options for random factors, including: 1) accounting for repeated sampling of individuals, 2) accounting for repeated sampling of individuals nested within population, and 3) accounting for repeated sampling of individuals and populations without nesting. All three models will be run and compared.

random factor models
rmr.3a <- glmmTMB(RESTING_MgO2.hr_RESPR ~ 1+ REGION * TEMPERATURE + MASS_CENTERED + RESTING_RUNTIME_SECONDS + (1|FISH_ID), 
                  family=gaussian(),
                  data = resp3,
                  REML = TRUE) 


rmr.3b <- glmmTMB(RESTING_MgO2.hr_RESPR ~ 1+ REGION * TEMPERATURE + MASS_CENTERED + RESTING_RUNTIME_SECONDS + (1|POPULATION/FISH_ID), 
                  family=gaussian(),
                  data = resp3,
                  REML = TRUE)

rmr.3c <- glmmTMB(RESTING_MgO2.hr_RESPR ~ 1+ REGION * TEMPERATURE + MASS_CENTERED + RESTING_RUNTIME_SECONDS + (1|FISH_ID) + (1|POPULATION), 
                  family=gaussian(),
                  data = resp3,
                  REML = TRUE)

random factor model comparisons

model df AICc BIC r2m r2c
rmr.3a 8 559.0743 584.1142 0.642662 0.7293407
rmr.3b 9 561.2823 589.3453 0.642662 0.7293407
rmr.3c 9 561.2823 589.3453 0.642662 0.7293407

Model rmr.3a appears to be the best model; however, the models barely change depending on how the random factors are arranged.

Model validation

performance
rmr.3a (linear)
rmr.3a %>% performance::check_model(detrend = FALSE)

The rmr.3a model performs well; however, in the model validation performed by the performance package it looks like there are two variables that are highly correlated. If we expand the figure we can see that the highly correlated variables are REGION and REGION:TEMPERATURE. Perhaps this is unsurprising, but let's see what happens when we run the quadratic (2^nd order polynomial) model to see if this helps deal with the high correlation between these two variables, as it performed very similarly to rmr.3a and even had a higher r2 value.

rmr.3.p2a (quadratic)

First we need to update the model by adding in the missing random factor

rmr.3.p2a <- glmmTMB(RESTING_MgO2.hr_RESPR ~ 1+ REGION * poly(TEMPERATURE, 2) + MASS_CENTERED + RESTING_RUNTIME_SECONDS + (1|FISH_ID), 
                 family=gaussian(),
                 data = resp3, 
                 REML = TRUE) 

DHARMa residuals
rmr.3a (linear)
rmr.3a %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.516 0.716 0.216 0.896 0.868 0.824 0.736 0.764 0.756 0.852 0.384 0.136 0.128 0.1 0 0.344 0.184 0.336 0.324 0.016 ...
rmr.3a %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.042545, p-value = 0.8874
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.97177, p-value = 0.792
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 1, observations = 187, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.0001353802 0.0294332214
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                            0.005347594
rmr.3.p2a (quadratic)
rmr.3.p2a %>% simulateResiduals(plot=TRUE) 

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.444 0.784 0.164 0.932 0.884 0.764 0.688 0.852 0.792 0.824 0.308 0.168 0.16 0.06 0 0.408 0.212 0.272 0.272 0.036 ...
rmr.3.p2a %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.04477, p-value = 0.8477
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.96383, p-value = 0.76
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 1, observations = 187, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.0001353802 0.0294332214
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                            0.005347594

It looks like the model that treats temperature as a second order polynomial does a better job at avoiding high levels of collinearity within the model. The quadratic model will be used moving forward because:

  • it performs just as well as the linear model based on the model comparison scores (i.e., AICc, BIC, and r2)
  • it does a better job of dealing with the collinearity that appeared in the linear model

Partial plots

ggemmeans

plot_model

Model investigation

summary
Estimate StdError Zvalue Pvalue
(Intercept) 8.3165021 0.5180504 16.0534626 0.0000000
REGIONLeading 0.0172115 0.2341284 0.0735129 0.9413980
poly(TEMPERATURE, 2)1 6.5820041 1.3078841 5.0325592 0.0000005
poly(TEMPERATURE, 2)2 2.7374223 1.1855357 2.3090171 0.0209426
MASS_CENTERED 133.3083220 12.2781446 10.8573670 0.0000000
RESTING_RUNTIME_SECONDS -0.0001467 0.0000321 -4.5683146 0.0000049
REGIONLeading:poly(TEMPERATURE, 2)1 0.8398236 1.7827144 0.4710926 0.6375746
REGIONLeading:poly(TEMPERATURE, 2)2 -2.2347739 1.7668476 -1.2648368 0.2059298
Anova
Chisq Df Pr(>Chisq)
REGION 0.003263 1 0.9544475
poly(TEMPERATURE, 2) 51.567789 2 0.0000000
MASS_CENTERED 117.882419 1 0.0000000
RESTING_RUNTIME_SECONDS 20.869498 1 0.0000049
REGION:poly(TEMPERATURE, 2) 1.823912 2 0.4017376
confint
2.5 % 97.5 % Estimate
(Intercept) 7.3011420 9.3318621 8.3165021
REGIONLeading -0.4416717 0.4760947 0.0172115
poly(TEMPERATURE, 2)1 4.0185984 9.1454098 6.5820041
poly(TEMPERATURE, 2)2 0.4138150 5.0610297 2.7374223
MASS_CENTERED 109.2436009 157.3730432 133.3083220
RESTING_RUNTIME_SECONDS -0.0002096 -0.0000837 -0.0001467
REGIONLeading:poly(TEMPERATURE, 2)1 -2.6542324 4.3338797 0.8398236
REGIONLeading:poly(TEMPERATURE, 2)2 -5.6977316 1.2281839 -2.2347739
Std.Dev.(Intercept)|FISH_ID 0.3510707 0.7291319 0.5059415
r-squared
R2_conditional R2_marginal optional
0.7370331 0.6487262 FALSE

Pairwise comparisons

emtrends [latitudes]
rmr.3.p2a %>% emtrends(var = "TEMPERATURE", type = "response") %>% pairs(by = "TEMPERATURE") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)

SCROLL TO THE RIGHT –>

The numbers in the left-most column of the table indicate that the slopes assume mean MASS_CENTERED and RESTING_RUNTIME_SECONDS values when comparing slopes between latitudes.

emmeans [latitudes]
rmr.3.p2a %>% emmeans(pairwise ~ TEMPERATURE*REGION, type = "response") %>% pairs(by = "TEMPERATURE") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)
temperature
rmr.3.p2a %>% emmeans(~ TEMPERATURE*REGION, type = "response")  %>% summary(infer=TRUE)
Means - f(temperature)
rmr.3.p2a %>% update(.~1+ REGION * as.factor(TEMPERATURE) + MASS_CENTERED + RESTING_RUNTIME_SECONDS + (1|FISH_ID)) %>% 
  emmeans(~REGION*TEMPERATURE, type = "response") %>% summary(infer=TRUE)
Abs. diff - f(temperature)
rmr.3.p2a %>% update(.~1+ REGION * as.factor(TEMPERATURE) + MASS_CENTERED + RESTING_RUNTIME_SECONDS + (1|FISH_ID)) %>% 
  emmeans(~REGION*TEMPERATURE, type = "response") %>% pairs(by ="REGION") %>% summary(infer=TRUE)
Effect size
rmr.emm <- rmr.3.p2a %>% emmeans(~REGION*TEMPERATURE)
eff_size(rmr.emm, sigma = sigma(rmr.3.p2a), edf=df.residual(rmr.3.p2a))
##  contrast                                                              
##  Core TEMPERATURE29.0855614973262 - Leading TEMPERATURE29.0855614973262
##  effect.size    SE  df lower.CL upper.CL
##       -0.249 0.324 185   -0.889    0.391
## 
## sigma used for effect sizes: 0.8731 
## Confidence level used: 0.95

Summary figure

Conclusion

  • In conclusion, resting oxygen consumption is significantly positively correlated with temperature; however, there is no significant difference in resting oxygen consumption between the low- and high-latitude regions.

Maximum oxygen consumption

Scenario

For details on the experiment performed please read the information at the top of this document. In brief, Acanthochromis polyacanthus from two different regions on the Great Barrier Reef (GBR) were tested for metabolic performance at four different temperatures: 27\(^\circ\)C, 28.5\(^\circ\)C, 30\(^\circ\)C, and 31.5\(^\circ\)C. Fish used in this study were collected from two different regions, low-latitude (i.e., Cairns) and high-latitude (i.e., Mackay); within each region fish were collected from three different populations. At each temperature, resting and maximum oxygen consumption were measured for every individual. Absolute aerobic scope was calculated using the following formula:

Absolute aerobic scope = (maximum oxygen consumption - resting oxygen consumption)

Individuals were first tested at 27\(^\circ\)C. Water temperature was then increased at a rate of 0.5\(^\circ\)C Day^-1 until the next temperature was reached. Fish were then provided with an additional 5 days to adjust to the new temperature before aerobic physiology was tested again.

Three traits are included within the aerobic physiology analysis: resting oxygen consumption, maximum oxygen consumption, and absolute aerobic scope. Data for each metric were collected from respirometry experiments recorded via a combination of programs, including AquaResp and PyroScience. Slopes (i.e., resting and maximum oxygen consumption values) were then calculated via the respR [https://januarharianto.github.io/respR/articles/respR.html] package.

Read in the data

Before beginning, always make sure that you are working in the correct directory.

knitr::opts_knit$set(root.dir=working.dir)

Now we can import the data. Replace import.data with the PATH to your data file; here the PATH has been stored in the object import.data (i.e. import.data = “PATH TO MY FILE”).

Load data

resp <- import.data

Data manipulation

Before the data can be analysed it is important to clean up the data file. Below, a number of adjustments are made: primarily making sure that columns are treated appropriately as factors, numeric, or time, as well as renaming some columns. Once these changes are made the data are saved into a new dataframe called resp2.

resp2 = resp %>% 
  dplyr::rename(EXP_FISH_ID = FISH_ID) %>%
  separate(EXP_FISH_ID, c("FISH_ID"), remove = FALSE) %>%
  mutate(FISH_ID = factor(FISH_ID), 
         POPULATION = factor(POPULATION), 
         REGION = factor(REGION), 
         TEMPERATURE = as.numeric(TEMPERATURE), #run with temperature as a factor
         RESTING_DATE = factor(RESTING_DATE), 
         RESTING_CHAMBER = factor(RESTING_CHAMBER), 
         RESTING_SYSTEM = factor(RESTING_SYSTEM), 
         RESTING_SUMP = factor(RESTING_SUMP), 
         RESTING_AM_PM = factor(RESTING_AM_PM), 
         RESTING_START_TIME = hms(RESTING_START_TIME),
         RESTING_END_TIME = hms(RESTING_ENDTIME),
         MAX_DATE = factor(MAX_DATE), 
         MAX_CHAMBER = factor(MAX_CHAMBER), 
         MAX_SYSTEM = factor(MAX_SYSTEM), 
         MAX_SUMP = factor(MAX_SUMP), 
         MAX_AM_PM = factor(MAX_AM_PM), 
         MAX_START_TIME = hms(MAX_START_TIME), 
         Swim.performance = factor(Swim.performance), 
         NAS = as.numeric(NAS), 
         FAS = as.numeric(FAS), 
         MgO2.hr_Net = as.numeric(MgO2.hr_Net), 
         RESTING_RUNTIME_SECONDS = as.numeric(hms(RESTING_RUNTIME))) %>% 
  dplyr::rename(MASS = WEIGHT) %>% 
  mutate(MASS_CENTERED = scale(MASS, scale = FALSE, center = TRUE))

Next, select data points will be removed. Beside each removed data point I have provided the reasoning for its exclusion, such as the fish died during the experiment, or the data quality was poor - which likely indicates that there was an issue with the equipment during the trial.

resp3 <- resp2 %>% 
  subset(  
    EXP_FISH_ID !="LCHA127_27" & # deceased during experiment
      EXP_FISH_ID !="LCHA132_27" & # deceased during experiment
      EXP_FISH_ID !="LKES168_27" # poor data quality
  ) 

So far the analysis has been the same as the protocol outlined for the resting oxygen consumption data. One additional data removal step takes place in the maximum oxygen consumption analysis: samples where fish swam poorly are removed, as their maximum oxygen consumption data are thought to be unreliable. This step is done before any data analysis has taken place.

resp4 <- resp3 %>% 
  subset(
    EXP_FISH_ID !="CSUD008_27" &  # poor swim
      EXP_FISH_ID !="CSUD008_30" &  # poor swim 
      EXP_FISH_ID !="CSUD008_28.5" & # poor swim
      EXP_FISH_ID !="CSUD018_31.5" & # poor swim 
      EXP_FISH_ID !="CSUD026_30" & # max. value low 
      EXP_FISH_ID !="CSUD074_28.5" & # fas value low 
      EXP_FISH_ID !="CSUD079_30" &
      EXP_FISH_ID !="CVLA052_27" & #nas value low 
      EXP_FISH_ID !="CVLA054_28.5" & # low max value? 
      EXP_FISH_ID !="LCHA113_27" & # poor data quality 
      EXP_FISH_ID !="LCHA113_30" & # poor swim 
      EXP_FISH_ID !="LCHA127_27" # deceased during experiment
  ) 

Exploratory data analysis

Mass v Max
ggplot(resp4, aes(MASS, MAX_MgO2.hr_RESPR)) + 
  geom_point() + 
  geom_smooth(method = "lm") + 
  theme_classic()

Mass v Max (LATITUDE)
ggplot(resp4, aes(MASS, MAX_MgO2.hr_RESPR, color = REGION)) + 
  geom_point() +
  theme_classic() + 
  geom_smooth(method = "lm")

TEMPERATURE v Max (LATITUDE)
ggplot(resp4, aes(TEMPERATURE, MAX_MgO2.hr_RESPR, color = REGION)) + 
  geom_point() +
  theme_classic()

Fit the model

The model was fit using the glm function and later the glmmTMB package in R. A number of different models were tested to determine which hypothesis and associated variables best predicted maximum oxygen consumption. Model fit was examined using AICc, BIC, and r-squared values. Additionally, models were examined via the validation diagnostics provided by the performance and DHARMa packages in R.

Fixed factors (linear regression models)
model 1
#--- base model ---#
mmr.1 <- glm(MAX_MgO2.hr_RESPR ~ 1+ REGION * TEMPERATURE + MASS_CENTERED, 
                 family=gaussian(),
                 data = resp4)  

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.5748341 4.4522838 -1.027525 0.3056245
REGIONLeading 17.6468408 6.5852635 2.679747 0.0080880
TEMPERATURE 0.6944978 0.1530872 4.536615 0.0000107
MASS_CENTERED 298.1596890 24.1436460 12.349406 0.0000000
REGIONLeading:TEMPERATURE -0.6609056 0.2261837 -2.921986 0.0039482
model 2
#--- experimental rmr equipment hypothesis ---#
mmr.2 <- glm(MAX_MgO2.hr_RESPR ~ 1+ REGION * TEMPERATURE + MAX_SUMP + MAX_CHAMBER + 
                   MAX_AM_PM, 
                 family=gaussian(),
                 data = resp4) 

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.2242856 6.4786135 -0.9607435 0.3380701
REGIONLeading 16.0665387 9.1812310 1.7499330 0.0819666
TEMPERATURE 0.7876835 0.2182937 3.6083665 0.0004069
MAX_SUMP2 -0.1739027 0.5317501 -0.3270383 0.7440484
MAX_CHAMBER2 0.9203132 0.7549758 1.2189970 0.2245644
MAX_CHAMBER3 1.1525268 0.7667736 1.5030861 0.1347055
MAX_CHAMBER4 0.5061699 0.8313336 0.6088649 0.5434413
MAX_AM_PMPM -0.4757712 0.5382845 -0.8838656 0.3780394
REGIONLeading:TEMPERATURE -0.7177722 0.3148211 -2.2799368 0.0238764
model comparison table
model df AICc BIC r2
mmr.1 6 828.4562 846.9821 0.6622066
mmr.2 10 945.6551 976.0266 0.3732903

The model that contains MASS_CENTERED seems to do better than the model that incorporates variables associated with the equipment used and the time that experiments were performed. This is demonstrated by the lower AICc and BIC scores, as well as the higher r-squared value.
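For reference, AICc is AIC with a small-sample correction. A minimal sketch of the correction (k = number of estimated parameters, n = number of observations; the helper name aicc is mine):

```r
# AICc = AIC + 2k(k + 1) / (n - k - 1), a small-sample correction to AIC.
# Works for any model with logLik(), AIC(), and nobs() methods.
aicc <- function(model) {
  k <- attr(logLik(model), "df")
  n <- nobs(model)
  AIC(model) + (2 * k * (k + 1)) / (n - k - 1)
}
```

As n grows the correction term shrinks toward zero, so AICc converges to AIC for large samples.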

Polynomials
polynomial models

Note that the linear model has already been created via model mmr.1 in the previous section.

mmr.1.p2 <- glm(MAX_MgO2.hr_RESPR ~ 1+ REGION * poly(TEMPERATURE, 2) + MASS_CENTERED, 
                 family=gaussian(),
                 data = resp4)

mmr.1.p3 <- glm(MAX_MgO2.hr_RESPR ~ 1+ REGION * poly(TEMPERATURE, 3) + MASS_CENTERED, 
                 family=gaussian(),
                 data = resp4)

polynomial model comparisons

model df AICc BIC r2
mmr.1 6 828.4562 846.9821 0.6673593
mmr.1.p2 8 832.3856 856.8872 0.6681820
mmr.1.p3 10 836.8419 867.2134 0.6682098

From our model comparison we can see that there is no additional benefit to including temperature as a 2nd or 3rd order polynomial. However, the linear and quadratic models both perform well.

Random factors

Fish were repeatedly sampled over four different temperatures, therefore repeated sampling needs to be accounted for. To do this, random factors will be included within the model. There are a number of options for random factors, including: 1) accounting for repeated sampling of individuals, 2) accounting for repeated sampling of individuals nested within population, and 3) accounting for repeated sampling of individuals and populations without nesting. All three models will be run and compared.

random factor models
mmr.1a <- glmmTMB(MAX_MgO2.hr_RESPR ~ 1+ REGION * TEMPERATURE + MASS_CENTERED + (1|FISH_ID), 
                  family=gaussian(),
                  data = resp4,
                  REML = TRUE) 

mmr.1b <- glmmTMB(MAX_MgO2.hr_RESPR ~ 1+ REGION * TEMPERATURE + MASS_CENTERED + (1|POPULATION/FISH_ID), 
                  family=gaussian(),
                  data = resp4,
                  REML = TRUE)

mmr.1c <- glmmTMB(MAX_MgO2.hr_RESPR ~ 1+ REGION * TEMPERATURE + MASS_CENTERED + (1|FISH_ID) + (REGION|POPULATION), 
                  family=gaussian(),
                  data = resp4,
                  REML = TRUE) # Convergence problem

random factor model comparisons

model df AICc BIC r2m r2c
mmr.1a 7 809.5847 831.1115 0.6693399 0.7836732
mmr.1b 8 811.5475 836.0491 0.6694043 0.7848247
mmr.1c 10 NA NA 0.6697666 0.7848011

Model mmr.1a appears to be the best model; however, there seems to be little difference between the models regardless of how the random factors are arranged.

Model validation

performance
mmr.1a (linear)

The mmr.1a model performs well; however, in the model validation performed by the performance package it looks like there are two variables that are highly correlated. If we expand the figure we can see that the highly correlated variables are REGION and REGION:TEMPERATURE. Perhaps this is unsurprising, but let's see what happens when we run the quadratic (2nd order polynomial) model to see if this helps deal with the high correlation between these two variables, as it performed very similarly to mmr.1a and even had a higher r2 value.
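The collinearity check described above comes from performance::check_collinearity(). A sketch on a built-in dataset (in the analysis the same call would be applied to mmr.1a):

```r
# Sketch: variance inflation factors via the performance package.
# Models with interactions typically flag high VIFs, as seen with
# REGION and REGION:TEMPERATURE in mmr.1a.
library(performance)

m <- lm(mpg ~ wt * hp, data = mtcars)
check_collinearity(m)
```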

mmr.1.p2a (quadratic)

First we need to update the quadratic model by adding in the missing random factor:

mmr.1.p2a <- glmmTMB(MAX_MgO2.hr_RESPR ~ 1+ REGION * poly(TEMPERATURE, 2) + MASS_CENTERED + (1|FISH_ID), 
                 family=gaussian(),
                 data = resp4, 
                 REML = TRUE) 

DHARMa residuals
mmr.1a (linear)
mmr.1a %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.212 0.328 0.456 0.744 0.812 0.852 1 0.7 0.372 0.684 0.692 0.032 0.216 0.204 0.612 0.084 0.452 0.988 0.936 0.848 ...
mmr.1a %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.0715, p-value = 0.3293
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.94899, p-value = 0.68
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 3, observations = 176, p-value = 0.1665
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.003529072 0.049003686
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                             0.01704545
mmr.1.p2a (quadratic)

The quadratic model mmr.1.p2a was already created in the previous section, so we can move straight to the residual diagnostics.
mmr.1.p2a %>% simulateResiduals(plot=TRUE) 

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.228 0.308 0.492 0.728 0.788 0.876 1 0.728 0.356 0.664 0.724 0.04 0.188 0.192 0.644 0.092 0.42 0.98 0.936 0.836 ...
mmr.1.p2a %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.076136, p-value = 0.2594
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.94042, p-value = 0.608
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 3, observations = 176, p-value = 0.1665
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.003529072 0.049003686
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                             0.01704545

It looks like the model that treats temperature as a second order polynomial does a better job at avoiding high levels of collinearity within the model. The quadratic model will be used moving forward because:

  • it performs just as well as the linear model based on the model comparison scores (i.e., AICc, BIC, and r2)
  • it does a better job of dealing with the collinearity that appeared in the linear model

Partial plots

ggemmeans

plot_model

Model investigation

summary
Estimate StdError Zvalue Pvalue
(Intercept) 15.665104 0.3860064 40.5824948 0.0000000
REGIONLeading -1.540213 0.6378566 -2.4146702 0.0157495
poly(TEMPERATURE, 2)1 14.695118 2.8646793 5.1297602 0.0000003
poly(TEMPERATURE, 2)2 -2.507947 2.8404686 -0.8829342 0.3772718
MASS_CENTERED 313.864413 34.0009493 9.2310485 0.0000000
REGIONLeading:poly(TEMPERATURE, 2)1 -13.979779 4.2142722 -3.3172464 0.0009091
REGIONLeading:poly(TEMPERATURE, 2)2 1.659481 4.1665787 0.3982838 0.6904210
Anova
Chisq Df Pr(>Chisq)
REGION 5.244328 1 0.0220184
poly(TEMPERATURE, 2) 16.284260 2 0.0002910
MASS_CENTERED 85.212257 1 0.0000000
REGION:poly(TEMPERATURE, 2) 11.182851 2 0.0037297
confint
2.5 % 97.5 % Estimate
(Intercept) 14.908545 16.4216626 15.665104
REGIONLeading -2.790389 -0.2900373 -1.540213
poly(TEMPERATURE, 2)1 9.080450 20.3097864 14.695118
poly(TEMPERATURE, 2)2 -8.075163 3.0592692 -2.507947
MASS_CENTERED 247.223777 380.5050495 313.864413
REGIONLeading:poly(TEMPERATURE, 2)1 -22.239601 -5.7199575 -13.979779
REGIONLeading:poly(TEMPERATURE, 2)2 -6.506863 9.8258252 1.659481
Std.Dev.(Intercept)|FISH_ID 1.054358 2.1168357 1.493956
r-squared
R2_conditional R2_marginal optional
0.7829506 0.6683685 FALSE

Pairwise comparisons

emtrends [latitudes]
mmr.1.p2a %>% emtrends(var = "TEMPERATURE", type = "response") %>% pairs(by = "TEMPERATURE") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)

SCROLL TO THE RIGHT -->

The values in the leftmost column of the table indicate that the slopes assume mean MASS_CENTERED values when comparing slopes between latitudes.

emmeans [latitudes]
mmr.1.p2a %>% emmeans(pairwise ~ TEMPERATURE*REGION, type = "response") %>% pairs(by = "TEMPERATURE") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)
temperature
mmr.1.p2a %>% emmeans(~ TEMPERATURE*REGION, type = "response")  %>% summary(infer=TRUE)
Means - f(temperature)
mmr.1.p2a %>% update(.~1+ REGION * as.factor(TEMPERATURE) + MASS_CENTERED + RESTING_RUNTIME_SECONDS + (1|FISH_ID)) %>% 
  emmeans(~REGION*TEMPERATURE, type = "response") %>% summary(infer=TRUE)
Abs. diff - f(temperature)
mmr.1.p2a %>% update(.~1+ REGION * as.factor(TEMPERATURE) + MASS_CENTERED + RESTING_RUNTIME_SECONDS + (1|FISH_ID)) %>% 
  emmeans(~REGION*TEMPERATURE, type = "response") %>% pairs(by ="REGION") %>% summary(infer=TRUE)
Effect size
mmr.emm <- mmr.1.p2a %>% emmeans(~REGION*TEMPERATURE)
eff_size(mmr.emm, sigma = sigma(mmr.1.p2a), edf=df.residual(mmr.1.p2a))
##  contrast                                                              
##  Core TEMPERATURE29.0965909090909 - Leading TEMPERATURE29.0965909090909
##  effect.size    SE  df lower.CL upper.CL
##        0.825 0.364 174    0.106     1.54
## 
## sigma used for effect sizes: 2.056 
## Confidence level used: 0.95

Summary figure

Conclusion

  • In conclusion, maximum oxygen consumption is significantly positively correlated with temperature, and fish from low latitudes have significantly higher maximum oxygen consumption at elevated temperatures than fish from high latitudes.

Absolute aerobic scope

Scenario

For details on the experiment performed please read the information at the top of this document. In brief, Acanthochromis polyacanthus from two different regions on the Great Barrier Reef (GBR) were tested for metabolic performance at four different temperatures: 27\(^\circ\)C, 28.5\(^\circ\)C, 30\(^\circ\)C, and 31.5\(^\circ\)C. Fish used in this study were collected from two different regions, low- (i.e., Cairns) and high-latitude (i.e., Mackay); within each region fish were collected from a total of three different populations. Resting and maximum oxygen consumption were measured for each individual at each temperature. Absolute aerobic scope was calculated using the following formula:

Absolute aerobic scope = (maximum oxygen consumption - resting oxygen consumption)
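As a sketch in R (the helper name aas and the example values are mine, not from the dataset; in the data used here the difference is already stored as MgO2.hr_Net):

```r
# Absolute (net) aerobic scope = maximum MO2 - resting MO2.
# Hypothetical helper; the example values are illustrative only.
aas <- function(max_mo2, rest_mo2) {
  max_mo2 - rest_mo2
}

aas(max_mo2 = 15.7, rest_mo2 = 9.7)
```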

Individuals were first tested at 27\(^\circ\)C. Water temperature was then increased at a rate of 0.5\(^\circ\)C day^-1 until the next temperature was reached. Fish were then provided with an additional 5 days to adjust to the new temperature before aerobic physiology was tested again.

Three traits are included within the aerobic physiology analysis: resting oxygen consumption, maximum oxygen consumption, and absolute aerobic scope. Data for each metric were collected from respirometry experiments, with data recorded via a combination of programs including AquaResp3 and PyroScience. Slopes (i.e., resting and maximum oxygen consumption values) were then calculated via the respR [https://januarharianto.github.io/respR/articles/respR.html] package.

Note: Absolute aerobic scope is sometimes called net aerobic scope. When making the models, labeling was done using ‘net aerobic scope’ (i.e., nas).

Read in the data

Before beginning always make sure that you are working in the correct directory

knitr::opts_knit$set(root.dir=working.dir)

Now we can import the data. Replace import.data with the PATH to your data file. I have secretly labelled my PATH import.data (i.e., import.data = “PATH TO MY FILE”).

Load data
resp <- import.data

Data manipulation

Before the data can be analysed it is important to clean up the data file. Below a number of adjustments are made, primarily making sure that columns are treated appropriately as either factors, numeric, or time, as well as renaming some columns. Once these changes are made the data is saved into a new dataframe called resp2.

resp2 = resp %>% 
  dplyr::rename(EXP_FISH_ID = FISH_ID) %>%
  separate(EXP_FISH_ID, c("FISH_ID"), remove = FALSE) %>%
  mutate(FISH_ID = factor(FISH_ID), 
         POPULATION = factor(POPULATION), 
         REGION = factor(REGION), 
         TEMPERATURE = as.numeric(TEMPERATURE), #run with temperature as a factor
         RESTING_DATE = factor(RESTING_DATE), 
         RESTING_CHAMBER = factor(RESTING_CHAMBER), 
         RESTING_SYSTEM = factor(RESTING_SYSTEM), 
         RESTING_SUMP = factor(RESTING_SUMP), 
         RESTING_AM_PM = factor(RESTING_AM_PM), 
         RESTING_START_TIME = hms(RESTING_START_TIME),
         RESTING_END_TIME = hms(RESTING_ENDTIME),
         MAX_DATE = factor(MAX_DATE), 
         MAX_CHAMBER = factor(MAX_CHAMBER), 
         MAX_SYSTEM = factor(MAX_SYSTEM), 
         MAX_SUMP = factor(MAX_SUMP), 
         MAX_AM_PM = factor(MAX_AM_PM), 
         MAX_START_TIME = hms(MAX_START_TIME), 
         Swim.performance = factor(Swim.performance), 
         NAS = as.numeric(NAS), 
         FAS = as.numeric(FAS), 
         MgO2.hr_Net = as.numeric(MgO2.hr_Net), 
         RESTING_RUNTIME_SECONDS = as.numeric(hms(RESTING_RUNTIME))) %>% 
  dplyr::rename(MASS = WEIGHT) %>% 
  mutate(MASS_CENTERED = scale(MASS, scale = FALSE, center = TRUE))

Next, select data points will be removed. Beside each removed data point I have provided the reasoning for its exclusion, such as the fish died during the experiment, or data quality was poor - which likely indicates that there was an issue with the equipment during the trial.

resp3 <- resp2 %>% 
  subset(  
    EXP_FISH_ID !="LCHA127_27" & # deceased during experiment
      EXP_FISH_ID !="LCHA132_27" & # deceased during experiment
      EXP_FISH_ID !="LKES168_27" # poor data quality
  ) 

So far the analysis has been the same as the protocol outlined in the maximum oxygen consumption analysis.

resp4 <- resp3 %>% 
  subset(
    EXP_FISH_ID !="CSUD008_27" &  # poor swim
      EXP_FISH_ID !="CSUD008_30" &  # poor swim 
      EXP_FISH_ID !="CSUD008_28.5" & # poor swim
      EXP_FISH_ID !="CSUD018_31.5" & # poor swim 
      EXP_FISH_ID !="CSUD026_30" & # max. value low 
      EXP_FISH_ID !="CSUD074_28.5" & # fas value low 
      EXP_FISH_ID !="CSUD079_30" &
      EXP_FISH_ID !="CVLA052_27" & #nas value low 
      EXP_FISH_ID !="CVLA054_28.5" & # low max value? 
      EXP_FISH_ID !="LCHA113_27" & # poor data quality 
      EXP_FISH_ID !="LCHA113_30" & # poor swim 
      EXP_FISH_ID !="LCHA127_27" # deceased during experiment
  ) 

Exploratory data analysis

Mass v AAS
ggplot(resp4, aes(MASS, MgO2.hr_Net)) + 
  geom_point() + 
  geom_smooth(method = "lm") + 
  theme_classic()

Mass v AAS (LATITUDE)
ggplot(resp4, aes(MASS, MgO2.hr_Net, color = REGION)) + 
  geom_point() +
  theme_classic() + 
  geom_smooth(method = "lm")

TEMPERATURE v AAS (LATITUDE)
ggplot(resp4, aes(TEMPERATURE, MgO2.hr_Net, color = REGION)) + 
  geom_point() +
  theme_classic()

Fit the model

The model was fit using the glm function, and later the glmmTMB package, in R. A number of different models were tested to determine which hypothesis and associated variables best predicted absolute aerobic scope. Model fit was examined using AICc, BIC, and r-squared values. Additional models were examined via the validation diagnostics provided by the performance and DHARMa packages in R.

Fixed factors (linear regression models)
model 1
#--- base model ---#
nas.1 <- glm(MgO2.hr_Net ~ 1+ REGION * TEMPERATURE + MASS_CENTERED, 
                 family=gaussian(),
                 data = resp4)  

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.6505476 4.5988210 -1.011248 0.3133266
REGIONLeading 17.8617376 6.8020030 2.625953 0.0094246
TEMPERATURE 0.4921098 0.1581257 3.112143 0.0021773
MASS_CENTERED 167.6469322 24.9382811 6.722473 0.0000000
REGIONLeading:TEMPERATURE -0.6707332 0.2336280 -2.870945 0.0046100
model 2
#--- experimental rmr equipment hypothesis ---#
nas.2 <- glm(MgO2.hr_Net ~ 1+ REGION * TEMPERATURE + MASS_CENTERED + RESTING_SUMP + RESTING_RUNTIME_SECONDS + 
                   RESTING_AM_PM, 
                 family=gaussian(),
                 data = resp4) 

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.1166901 4.6900539 -0.8777490 0.3813335
REGIONLeading 17.7878857 6.8578591 2.5937958 0.0103304
TEMPERATURE 0.4314945 0.1732534 2.4905397 0.0137265
MASS_CENTERED 167.2794598 25.2897560 6.6145146 0.0000000
RESTING_SUMP2 0.0078043 0.4108966 0.0189932 0.9848690
RESTING_RUNTIME_SECONDS 0.0000818 0.0000936 0.8735816 0.3835932
RESTING_AM_PMPM -0.1069842 0.3960028 -0.2701603 0.7873685
REGIONLeading:TEMPERATURE -0.6678804 0.2355090 -2.8359020 0.0051315
model comparison table
model df AICc BIC r2
nas.1 6 839.8550 858.3808 0.440634
nas.2 9 845.5166 872.9666 0.439167

The simpler base model seems to do better than the model that additionally incorporates variables associated with the equipment used and the time that experiments were performed. This is demonstrated by the lower AICc and BIC scores, as well as the slightly higher r-squared value.

Polynomials
polynomial models

Note that the linear model has already been created via model nas.1 in the previous section.

#--- second order polynomial ---# 
nas.1.p2 <- glm(MgO2.hr_Net ~ 1+ REGION * poly(TEMPERATURE,2) + MASS_CENTERED, 
                 family=gaussian(),
                 data = resp4) 

#--- third order polynomial ---#
nas.1.p3 <- glm(MgO2.hr_Net ~ 1+ REGION * poly(TEMPERATURE, 3) + MASS_CENTERED, 
                 family=gaussian(),
                 data = resp4) 

polynomial model comparisons

model df AICc BIC r2
nas.1 6 839.8550 858.3808 0.4463407
nas.1.p2 8 840.4431 864.9447 0.4580961
nas.1.p3 10 844.9023 875.2738 0.4581327

From our model comparison we can see that there is no additional benefit to including temperature as a 2nd or 3rd order polynomial. However, the linear and quadratic models both perform well.

Random factors

Fish were repeatedly sampled over four different temperatures, therefore repeated sampling needs to be accounted for. To do this, random factors will be included within the model. There are a number of options for random factors, including: 1) accounting for repeated sampling of individuals, 2) accounting for repeated sampling of individuals nested within population, and 3) accounting for repeated sampling of individuals and populations without nesting. All three models will be run and compared.

random factor models
nas.1a <- glmmTMB(MgO2.hr_Net ~ 1+ REGION * TEMPERATURE + MASS_CENTERED + (1|FISH_ID), 
                  family=gaussian(),
                  data = resp4,
                  REML = TRUE) 

nas.1b <- glmmTMB(MgO2.hr_Net ~ 1+ REGION * TEMPERATURE + MASS_CENTERED + (1|POPULATION/FISH_ID), 
                  family=gaussian(),
                  data = resp4,
                  REML = TRUE)

nas.1c <- glmmTMB(MgO2.hr_Net ~ 1+ REGION * TEMPERATURE + MASS_CENTERED + (1|FISH_ID) + (REGION|POPULATION), 
                  family=gaussian(),
                  data = resp4,
                  REML = TRUE) # convergence problem

random factor model comparisons

model df AICc BIC r2m r2c
nas.1a 7 828.7221 850.2488 0.4525497 0.5901940
nas.1b 8 830.8106 855.3122 0.4542611 0.5928177
nas.1c 10 NA NA 0.4560622 0.5944637

Model nas.1a appears to be the best model; however, there seems to be little difference between the models regardless of how the random factors are arranged.

Model validation

performance
nas.1a (linear)

The nas.1a model performs well; however, in the model validation performed by the performance package it looks like there are two variables that are highly correlated. If we expand the figure we can see that the highly correlated variables are REGION and REGION:TEMPERATURE. Perhaps this is unsurprising, but let's see what happens when we run the quadratic (2nd order polynomial) model to see if this helps deal with the high correlation between these two variables, as it performed very similarly to nas.1a and even had a higher r2 value.

nas.1.p2a (quadratic)

First we need to update the quadratic model by adding in the missing random factor:

nas.1.p2a <- glmmTMB(MgO2.hr_Net ~ 1+ REGION * poly(TEMPERATURE,2) + MASS_CENTERED + (1|FISH_ID), 
                 family=gaussian(),
                 data = resp4, 
                 REML=TRUE) 

DHARMa residuals
nas.1a (linear)
nas.1a %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.212 0.244 0.556 0.532 0.74 0.728 1 0.712 0.596 0.868 0.788 0.364 0.28 0.416 0.744 0.124 0.676 0.996 0.788 0.864 ...
nas.1a %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.063636, p-value = 0.4741
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.96215, p-value = 0.744
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 2, observations = 176, p-value = 0.4095
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.001379164 0.040444565
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                             0.01136364
nas.1.p2a (quadratic)

The quadratic model nas.1.p2a was already created in the previous section, so we can move straight to the residual diagnostics.
nas.1.p2a %>% simulateResiduals(plot=TRUE) 

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.268 0.2 0.648 0.472 0.664 0.8 1 0.756 0.488 0.812 0.844 0.4 0.224 0.352 0.792 0.172 0.596 0.996 0.876 0.828 ...
nas.1.p2a %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.060045, p-value = 0.5497
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.95158, p-value = 0.64
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 2, observations = 176, p-value = 0.4095
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.001379164 0.040444565
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                             0.01136364

It looks like the model that treats temperature as a second order polynomial does a better job at avoiding high levels of collinearity within the model. The quadratic model will be used moving forward because:

  • it performs just as well as the linear model based on the model comparison scores (i.e., AICc, BIC, and r2)
  • it does a better job of dealing with the collinearity that appeared in the linear model

Partial plots

ggemmeans

plot_model

Model investigation

summary
Estimate StdError Zvalue Pvalue
(Intercept) 9.676637 0.3717003 26.033437 0.0000000
REGIONLeading -1.632353 0.6124654 -2.665216 0.0076939
poly(TEMPERATURE, 2)1 10.324322 3.0798654 3.352199 0.0008017
poly(TEMPERATURE, 2)2 -6.546341 3.0573845 -2.141157 0.0322613
MASS_CENTERED 179.593117 32.6469695 5.501066 0.0000000
REGIONLeading:poly(TEMPERATURE, 2)1 -14.500327 4.5325854 -3.199129 0.0013784
REGIONLeading:poly(TEMPERATURE, 2)2 4.897260 4.4894335 1.090841 0.2753427
Anova
Chisq Df Pr(>Chisq)
REGION 6.537818 1 0.0105605
poly(TEMPERATURE, 2) 6.515869 2 0.0384678
MASS_CENTERED 30.261721 1 0.0000000
REGION:poly(TEMPERATURE, 2) 11.460373 2 0.0032465
confint
2.5 % 97.5 % Estimate
(Intercept) 8.9481173 10.4051557 9.676637
REGIONLeading -2.8327627 -0.4319423 -1.632353
poly(TEMPERATURE, 2)1 4.2878974 16.3607477 10.324322
poly(TEMPERATURE, 2)2 -12.5387048 -0.5539778 -6.546341
MASS_CENTERED 115.6062324 243.5800011 179.593117
REGIONLeading:poly(TEMPERATURE, 2)1 -23.3840309 -5.6166225 -14.500327
REGIONLeading:poly(TEMPERATURE, 2)2 -3.9018683 13.6963875 4.897260
Std.Dev.(Intercept)|FISH_ID 0.8790124 1.9917277 1.323160
r-squared
R2_conditional R2_marginal optional
0.6031021 0.4622269 FALSE

Pairwise comparisons

emtrends [latitudes]
nas.1.p2a %>% emtrends(var = "TEMPERATURE", type = "response") %>% pairs(by = "TEMPERATURE") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)

SCROLL TO THE RIGHT -->

The values in the leftmost column of the table indicate that the slopes assume mean MASS_CENTERED values when comparing slopes between latitudes.

emmeans [latitudes]
nas.1.p2a %>% emmeans(pairwise ~ TEMPERATURE*REGION, type = "response") %>% pairs(by = "TEMPERATURE") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)
temperature
nas.1.p2a %>% emmeans(~ TEMPERATURE*REGION, type = "response")  %>% summary(infer=TRUE)
Means - f(temperature)
nas.1.p2a %>% update(.~1+ REGION * as.factor(TEMPERATURE) + MASS_CENTERED + RESTING_RUNTIME_SECONDS + (1|FISH_ID)) %>% 
  emmeans(~REGION*TEMPERATURE, type = "response") %>% summary(infer=TRUE)
Abs. diff - f(temperature)
nas.1.p2a %>% update(.~1+ REGION * as.factor(TEMPERATURE) + MASS_CENTERED + RESTING_RUNTIME_SECONDS + (1|FISH_ID)) %>% 
  emmeans(~REGION*TEMPERATURE, type = "response") %>% pairs(by ="REGION") %>% summary(infer=TRUE)
Effect size
aas.emm <- nas.1.p2a %>% emmeans(~REGION*TEMPERATURE)
eff_size(aas.emm, sigma = sigma(nas.1.p2a), edf=df.residual(nas.1.p2a))
##  contrast                                                              
##  Core TEMPERATURE29.0965909090909 - Leading TEMPERATURE29.0965909090909
##  effect.size    SE  df lower.CL upper.CL
##        0.941 0.336 174    0.278     1.61
## 
## sigma used for effect sizes: 2.221 
## Confidence level used: 0.95

Summary figure

## Warning: Removed 4 rows containing missing values (`geom_point()`).

Conclusion

  • In conclusion, absolute aerobic scope is significantly positively correlated with temperature, and fish from low latitudes have a significantly higher absolute aerobic scope at elevated temperatures than fish from high latitudes.

Enzymes

Lactate dehydrogenase

Scenario

For initial details, see the information at the top of this document. In brief, Acanthochromis polyacanthus from two different regions on the Great Barrier Reef (GBR) were tested for metabolic performance at four different temperatures: 27\(^\circ\)C, 28.5\(^\circ\)C, 30\(^\circ\)C, and 31.5\(^\circ\)C. Fish used in this study were collected from two different regions, low- (i.e., Cairns) and high-latitude (i.e., Mackay); within each region fish were collected from a total of three different populations. After metabolic performance was tested, blood and tissue samples were collected. White muscle tissue samples were used to look at the relationship between activity and temperature for two different enzymes, Lactate Dehydrogenase (LDH; anaerobic) and Citrate Synthase (CS; aerobic). Enzyme activity was measured at four different temperatures: 20\(^\circ\)C, 30\(^\circ\)C, 40\(^\circ\)C, and 50\(^\circ\)C. Enzyme activity was measured using a spectrometer, and wavelength absorption levels were recorded using the software program LabX.

Read in the data

Before beginning always make sure that you are working in the correct directory

knitr::opts_knit$set(root.dir=working.dir)

Now we can import the data. Two different data frames are being imported. The first has all the enzyme wavelength absorption data for each sample, and the tissue.mass data file contains information pertaining to the tissue sample that was used for each fish. Later on these two data frames will be merged.

Load data

ldh <- read_delim("./enzymes/LDH_LocalAdapt.txt", delim = "\t", 
                  escape_double = FALSE, col_types = cols(`Creation time` = col_datetime(format = "%d/%m/%Y %H:%M")), 
                  trim_ws = TRUE)
tissue.mass <- read.delim("C:/Users/jc527762/OneDrive - James Cook University/PhD dissertation/Data/Chapter1_LocalAdaptation/enzymes/tissue_mass.txt") 

Data manipulation

Before the data can be analysed it is important to clean up the data file. I won’t explain each step; the details can be worked out by examining the different functions. The main steps occurring below are columns being broken up to make new columns (this is the separate function), and columns being combined to make a unique UNIQUE_SAMPLE_ID value. Time data can also be tricky to deal with in R, so a number of data manipulation steps are used to make sure that time is being read properly.

#--- data preparation/manipulation ---# 
ldh2 <- ldh %>%
  clean_names() %>%
  mutate(muscle_type = str_replace(muscle_type, " ", ".")) %>%
  unite("UNIQUE_SAMPLE_ID", c(fish_id,temperature,sample_index), sep="_", remove = FALSE) %>% 
  separate(creation_time, into=c('DATE','TIME'), sep = " ", remove = FALSE) %>% 
  arrange(sample_id_1, DATE, TIME) 

ldh3 <- ldh2 %>% 
  mutate(TIME = hms(ldh2$TIME)) %>% 
  mutate(TIME = chron(times=ldh2$TIME)) %>% 
  arrange(TIME) %>%
  group_by(UNIQUE_SAMPLE_ID, sample_id_1) %>% 
  mutate(TIME_DIFF = TIME - first(TIME)) %>% 
  filter(TIME != first(TIME)) %>%
  ungroup() %>% 
  mutate(TIME_DIFF_SECS = period_to_seconds(hms(TIME_DIFF))) %>% 
  mutate(MINUTES = TIME_DIFF_SECS/60) %>% 
  mutate(MINUTES = round(MINUTES, digits = 2)) %>% 
  dplyr::rename(CUVETTE = sample_id_1) %>% 
  mutate(REGION = substr(fish_id, 1, 1 ), 
         POPULATION = substr(fish_id, 2, 4), 
         SAMPLE_NO = substr(fish_id, 5, 7)) %>% 
  mutate(REGION = case_when( REGION =="L"~ "Leading", 
                             REGION == "C" ~ "Core", 
                             TRUE ~ "na"))   

Data cleaning

Next, select data points will be removed. Data points were checked using a previously written app script that allows the user to look at the plotted points of the samples that were run. Because we are interested in the slope, points that plateaued were removed from the analysis, as a plateau signifies that the reaction ran out of ‘fuel’. For LDH, ‘fuel’ refers to NADH; for CS, ‘fuel’ refers to oxaloacetic acid. Samples that were trimmed were placed into one of five different groups, listed below:

  • grp1: removed the last data point, which was causing a plateau
  • grp2: removed the last 2 data points, which were causing a plateau
  • grp3: removed the last 3 data points, which were causing a plateau
  • grp4: removed the last 4 data points, which were causing a plateau
  • grp5: removed the last 5 data points, which were causing a plateau
grp1 <- c("CSUD008_20_1","CSUD008_20_2","CSUD008_20_3","CSUD008_20_4","CSUD008_20_5","CSUD008_20_6", 
          "CVLA047_50_1","CVLA047_50_2","CVLA047_50_3","CVLA047_50_4","CVLA047_50_5","CVLA047_50_6", 
          "CVLA046_50_1","CVLA046_50_2","CVLA046_50_3","CVLA046_50_4","CVLA046_50_5","CVLA046_50_6") 
grp2 <- c("LCKM180_30_1","LCKM180_30_2","LCKM180_30_3","LCKM180_30_4","LCKM180_30_5","LCKM180_30_6", 
          "LKES172_50_1","LKES172_50_2","LKES172_50_3","LKES172_50_4","LKES172_50_5","LKES172_50_6", 
          "LCHA114_50_1","LCHA114_50_2","LCHA114_50_3","LCHA114_50_4","LCHA114_50_5","LCHA114_50_6", 
          "CSUD074_50_1","CSUD074_50_2","CSUD074_50_3","CSUD074_50_4","CSUD074_50_5","CSUD074_50_6")
grp3 <- c("LCKM165_50_1","LCKM165_50_2","LCKM165_50_3","LCKM165_50_4","LCKM165_50_5","LCKM165_50_6", 
          "LCKM163_50_1","LCKM163_50_2","LCKM163_50_3","LCKM163_50_4","LCKM163_50_5","LCKM163_50_6", 
          "CTON068_50_1","CTON068_50_2","CTON068_50_3","CTON068_50_4","CTON068_50_5","CTON068_50_6", 
          "CVLA104_50_1","CVLA104_50_2","CVLA104_50_3","CVLA104_50_4","CVLA104_50_5","CVLA104_50_6") 
grp4 <- c("LCHA135_50_1","LCHA135_50_2","LCHA135_50_3","LCHA135_50_4","LCHA135_50_5","LCHA135_50_6", 
          "CTON069_50_1","CTON069_50_2","CTON069_50_3","CTON069_50_4","CTON069_50_5","CTON069_50_6", 
          "CVLA045_50_1","CVLA045_50_2","CVLA045_50_3","CVLA045_50_4","CVLA045_50_5","CVLA045_50_6") 
grp5 <- c("CSUD014_50_1","CSUD014_50_2","CSUD014_50_3","CSUD014_50_4","CSUD014_50_5","CSUD014_50_6", 
          "CTON110_50_1","CTON110_50_2","CTON110_50_3","CTON110_50_4","CTON110_50_5","CTON110_50_6")  

For some samples entire runs on certain cuvettes were poor. These samples were removed below, as well as samples from each grp outlined above:

ldh3.filtered <- ldh3 %>% 
  filter(!(UNIQUE_SAMPLE_ID %in% c("LCKM154_20_1", 
                                   "LKES143_30_3", 
                                   "LKES143_20_2", 
                                   "CSUD010_40_2", 
                                   "LCHA135_20_1", 
                                   "LCHA135_20_2"))) %>% 
  group_by(UNIQUE_SAMPLE_ID) %>% 
  arrange(UNIQUE_SAMPLE_ID, TIME) %>% 
  filter(!(UNIQUE_SAMPLE_ID %in% grp1 & row_number() > (n() - 1))) %>% 
  filter(!(UNIQUE_SAMPLE_ID %in% grp2 & row_number() > (n() - 2))) %>% 
  filter(!(UNIQUE_SAMPLE_ID %in% grp3 & row_number() > (n() - 3))) %>% 
  filter(!(UNIQUE_SAMPLE_ID %in% grp4 & row_number() > (n() - 4))) %>% 
  filter(!(UNIQUE_SAMPLE_ID %in% grp5 & row_number() > (n() - 5))) %>% 
  ungroup() %>% 
  mutate(UNIQUE_SAMPLE_ID = str_sub(UNIQUE_SAMPLE_ID, end = -3)) 

Great! Now we have all the data points that we want to keep. However, the data needs to be manipulated so that we can pull out slopes from the absorbance readings and then calculate enzyme activity based on these slopes. This will involve a number of steps.

Data calculations

Step1: Extract slopes

Step1 will produce a data frame that provides you with the slope obtained for cuvettes 1-3 for each sample run at each experimental temperature.

LDH_activity <- ldh3.filtered %>% 
  group_by(UNIQUE_SAMPLE_ID, CUVETTE) %>% 
  do({
    mod = lm(result ~ MINUTES, data = .)
    data.frame(Intercept = coef(mod)[1],
               Slope = coef(mod)[2], 
               r2 = summary(mod)$adj.r.squared)
  }) %>%
  ungroup() %>%
  filter(CUVETTE != ("6"))%>% 
  filter(CUVETTE != ("4"))%>% 
  filter(CUVETTE != ("5")) %>% 
  filter(Slope <= 0)
Step2: Slope means

Step2 will calculate the mean slope for cuvette 1-3.

LDH_activity_means <- LDH_activity %>% 
  group_by(UNIQUE_SAMPLE_ID) %>% 
  dplyr::mutate(Mean = mean(Slope)) %>% 
  ungroup()
Step3: Background activity level

Step3 will calculate the background activity level by measuring the slope from cuvette 5 (positive control).

LDH_background <- ldh3 %>% 
  group_by(UNIQUE_SAMPLE_ID, CUVETTE) %>% 
  do({
    mod = lm(result ~ MINUTES, data = .)
    data.frame(Intercept = coef(mod)[1],
               Slope = coef(mod)[2])
  }) %>%
  ungroup() %>%
  filter(CUVETTE == ("5")) %>% 
  dplyr::rename(Background = Slope) %>% 
  mutate(UNIQUE_SAMPLE_ID = str_sub(UNIQUE_SAMPLE_ID, end = -3)) 
Step4: Merging dataframes

Step4 will merge the data frames that you created with the mean slopes and the background slopes.

final_table <- LDH_activity %>% 
  full_join(distinct(LDH_activity_means[,c(1,6)]), by = "UNIQUE_SAMPLE_ID") %>% 
  full_join(LDH_background[,c(1,4)], by = "UNIQUE_SAMPLE_ID") 
# Blank out repeated Mean/Background values (they are duplicated across the
# three cuvette rows of each sample); assigning "" coerces the columns to
# character, so they are converted back to numeric below (blanks become NA).
final_table$Mean[duplicated(final_table$Mean)] <- ""
final_table$Background[duplicated(final_table$Background)] <- ""
final_table <- final_table %>% 
  mutate(Mean = as.numeric(Mean), 
         Background = as.numeric(Background), 
         Background_perc = Background/Mean) 
Step5: Enzyme activity levels

Step5 is where enzyme activity levels are calculated. See further details in the manuscript (doi: xxx). Within this step background activity is taken into account and is subtracted from the sample slope whenever background activity was more than 5% of the sample slope.

ldh.data <- final_table %>% 
  select(c(UNIQUE_SAMPLE_ID, Mean, Background, Background_perc)) %>% 
  mutate(Mean = as.numeric(Mean), 
         Background = as.numeric(Background), 
         Background_perc = as.numeric(Background_perc)) %>% 
  mutate(Background2 = case_when(Background_perc <= 0.05 ~ 0, 
                                    TRUE ~ Background), 
         LDH_ABSORBANCE = Mean - Background2) %>%
  drop_na() %>% 
  inner_join(select(ldh3.filtered, c(UNIQUE_SAMPLE_ID, REGION, POPULATION, temperature, fish_id)), by ="UNIQUE_SAMPLE_ID") %>% 
  inner_join(tissue.mass, by = "fish_id") %>% 
  mutate(TISSUE_MASS_CENTERED = scale(TISSUE_MASS, center = TRUE, scale = FALSE)) %>%
  distinct(UNIQUE_SAMPLE_ID, REGION, POPULATION, .keep_all = TRUE) %>% 
  mutate(PATH_LENGTH = 1, 
         EXTINCTION_COEFFICIENT = 6.22, 
         TISSUE_CONCENTRATION = 0.005, 
         ASSAY_VOL = 2.975, 
         SAMPLE_VOL = 0.025, 
         LDH_ACTIVITY = ((LDH_ABSORBANCE/(PATH_LENGTH*EXTINCTION_COEFFICIENT*TISSUE_CONCENTRATION))*(ASSAY_VOL/SAMPLE_VOL))*-1) %>% 
  filter(LDH_ACTIVITY >=0) %>% 
  filter(fish_id != "CVLA047")
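For clarity, the LDH_ACTIVITY calculation in the final mutate above corresponds to the standard spectrophotometric activity formula (the 340 nm NADH reading is our assumption, inferred from the extinction coefficient of 6.22):

\[ \text{LDH activity} = -1 \times \frac{\Delta A/\text{min}}{l \times \varepsilon \times C_{tissue}} \times \frac{V_{assay}}{V_{sample}} \]

where \(\Delta A/\text{min}\) is the background-corrected slope (LDH_ABSORBANCE), \(l\) the path length (1 cm), \(\varepsilon\) the millimolar extinction coefficient of NADH (6.22), \(C_{tissue}\) the tissue concentration, and the factor of -1 converts the negative NADH-depletion slope into a positive activity.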

By the end of this stage you should have a data frame that includes a column called LDH_ACTIVITY along with the necessary metadata - this data frame will be used to perform the statistical analysis.

Exploratory data analysis

LDH v TEMPERATURE [LATITUDE]
ggplot(ldh.data, aes(x =as.numeric(temperature), y= LDH_ACTIVITY, color = REGION)) + 
  geom_point() + geom_smooth(method = "lm", se=FALSE)

LDH V TEMPERATURE [DENSITY]
ggplot(ldh.data, aes(x = LDH_ACTIVITY, fill = temperature, color = temperature)) + 
  geom_density(alpha =0.5, position = "identity") 

LDH v TISSUE MASS (LATITUDE)
ggplot(ldh.data, aes(x =TISSUE_MASS_CENTERED, y= LDH_ACTIVITY, color = REGION)) + 
  geom_point() + geom_smooth(method = "lm", se=FALSE)

Fit the model

The model was fit using the glm function and later the glmmTMB package in R. A number of different models were tested to determine which hypothesis and associated variables best predicted LDH activity. Model fit was examined using AICc, BIC, and r-squared values. Additionally, models were examined via the validation diagnostics provided by the performance and DHARMa packages in R.

Fixed factors (linear regression models)
model 1
#--- base model ---#
ldh.model.1 <- glm(LDH_ACTIVITY ~ 1 + REGION*temperature + TISSUE_MASS_CENTERED, 
                       family=gaussian(), 
                       data = ldh.data)  

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) -99.2443321 12.5813978 -7.888180 0.0000000
REGIONLeading -22.5492426 18.7654475 -1.201636 0.2315047
temperature 6.3664611 0.3417543 18.628765 0.0000000
TISSUE_MASS_CENTERED -1.1283375 0.6114312 -1.845404 0.0670617
REGIONLeading:temperature 0.5835368 0.5079451 1.148818 0.2525623
model 2
ldh.model.2 <- glm(LDH_ACTIVITY ~ 1 + REGION*temperature, 
                       family=gaussian(), 
                       data = ldh.data)

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) -100.6938247 12.6620295 -7.952424 0.0000000
REGIONLeading -20.0428939 18.8729981 -1.061988 0.2900314
temperature 6.3664611 0.3446168 18.474030 0.0000000
REGIONLeading:temperature 0.5898524 0.5121880 1.151633 0.2513939
model comparison table
model df AICc BIC r2
ldh.model.1 6 1462.944 1480.287 0.8260431
ldh.model.2 5 1464.253 1478.780 0.8229165

The model that contains TISSUE_MASS_CENTERED seems to do better than the model that leaves TISSUE_MASS_CENTERED out. Therefore we will move ahead with the model that contains TISSUE_MASS_CENTERED as a co-variate.
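A comparison table like the one above can be assembled along these lines (a sketch assuming the MuMIn package for AICc; the helper name `compare_models` and the deviance-based r-squared are our own, not necessarily what generated the original table):

```r
#--- sketch: build a model comparison table (assumed approach) ---#
library(MuMIn)  # provides AICc()

compare_models <- function(mod.list) {
  data.frame(
    model = names(mod.list),
    df    = sapply(mod.list, function(m) attr(logLik(m), "df")),
    AICc  = sapply(mod.list, AICc),
    BIC   = sapply(mod.list, BIC),
    # deviance-explained r-squared for gaussian glm objects
    r2    = sapply(mod.list, function(m) 1 - m$deviance/m$null.deviance)
  )
}

compare_models(list(ldh.model.1 = ldh.model.1, ldh.model.2 = ldh.model.2))
```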

Polynomials

polynomial models

Note that the linear model has already been created via model ldh.model.1 in the previous section.

#--- second order polynomial ---# 
ldh.model.1.p2 <- glm(LDH_ACTIVITY ~ 1 + REGION*poly(temperature, 2) + TISSUE_MASS_CENTERED, 
                       family=gaussian(), 
                       data = ldh.data)  

#--- third order polynomial ---# 
ldh.model.1.p3 <- glm(LDH_ACTIVITY ~ 1 + REGION*poly(temperature, 3) + TISSUE_MASS_CENTERED, 
                          family=gaussian(), 
                          data = ldh.data)  
polynomial model comparisons
model df AICc BIC r2
ldh.model.1 6 1462.944 1480.287 0.8299988
ldh.model.1.p2 8 1458.594 1481.474 0.8398699
ldh.model.1.p3 10 1458.118 1486.405 0.8452779

From our model comparison we can see that the model that includes temperature as a second order polynomial performs the best (it has the lowest BIC, and its AICc is essentially identical to that of the third order model). Therefore, moving forward we will use the second order polynomial model.

Random factors

Fish were repeatedly sampled over four different temperatures, therefore repeated sampling needs to be accounted for. To do this, random factors will be included within the model. There are a number of options for random factors, including: 1) accounting for repeated sampling of individuals; 2) accounting for repeated sampling of individuals nested within population; and 3) accounting for repeated sampling of individuals and populations without nesting. All three models will be run and compared.

random factor models
ldh.model.1.p2a <- glmmTMB(LDH_ACTIVITY ~ 1 + REGION*poly(temperature, 2) + TISSUE_MASS_CENTERED + (1|fish_id), 
                       family=gaussian(), 
                       data = ldh.data, 
                       REML = TRUE) 

ldh.model.1.p2b <- glmmTMB(LDH_ACTIVITY ~ 1 + REGION*poly(temperature, 2) + TISSUE_MASS_CENTERED + (1|POPULATION/fish_id), 
                  family=gaussian(), 
                  data = ldh.data, 
                  REML = TRUE) 

ldh.model.1.p2c <- glmmTMB(LDH_ACTIVITY ~ 1 + REGION*poly(temperature, 2) + TISSUE_MASS_CENTERED + (1|fish_id) + (1 + REGION|POPULATION), 
                       family=gaussian(), 
                       data = ldh.data, 
                       REML = TRUE) # convergence problem
random factor model comparisons
model df AICc BIC r2m r2c
ldh.model.1.p2a 9 1349.501 1375.101 0.8318206 0.8318206
ldh.model.1.p2b 10 1351.804 1380.091 0.8318232 0.8318232
ldh.model.1.p2c 12 NA NA 0.8318230 0.8318230

Model ldh.model.1.p2a appears to be the best model; however, there is little difference between the models regardless of how the random factors are arranged.

Model validation

performance
ldh.model.1.p2a (2nd order polynomial)

The ldh.model.1.p2a model looks like it performs well.
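The performance checks themselves were not echoed above; a minimal sketch of how they can be run, assuming the performance package mentioned earlier:

```r
#--- sketch: performance-based validation (plots not echoed above) ---#
library(performance)

check_model(ldh.model.1.p2a)  # residual, normality, and homogeneity panels
r2(ldh.model.1.p2a)           # marginal and conditional r-squared
```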

DHARMa residuals
ldh.model.1.p2a (2nd order polynomial)
ldh.model.1.p2a %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.236 0.052 0.004 0.012 0.42 0.228 0.516 0.576 0.476 0.7 0.728 0.676 0.824 0.844 0.956 0.784 0.352 0.2 0.956 0.252 ...
ldh.model.1.p2a %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.087946, p-value = 0.2056
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.94107, p-value = 0.792
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 1, observations = 147, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.0001722152 0.0373181816
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                            0.006802721

The model performs well and passes validation checks.

Partial plots

ggemmeans

plot_model
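The two headings above refer to partial-effect plots whose code was not echoed. A sketch using the ggeffects and sjPlot packages (the exact term specification is our assumption):

```r
#--- sketch: partial plots for the final LDH model ---#
library(ggeffects)  # ggemmeans()
library(sjPlot)     # plot_model()

# Estimated marginal means across temperature, split by region
plot(ggemmeans(ldh.model.1.p2a, terms = c("temperature [all]", "REGION")))

# Equivalent partial-effect plot via sjPlot
plot_model(ldh.model.1.p2a, type = "emm", terms = c("temperature [all]", "REGION"))
```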

Model investigation

summary
Estimate StdError Zvalue Pvalue
(Intercept) 124.174707 6.502081 19.0976871 0.0000000
REGIONLeading -2.173170 9.729606 -0.2233564 0.8232582
poly(temperature, 2)1 861.389304 28.248934 30.4928076 0.0000000
poly(temperature, 2)2 84.094574 28.324236 2.9689971 0.0029877
TISSUE_MASS_CENTERED -1.120902 1.017910 -1.1011804 0.2708182
REGIONLeading:poly(temperature, 2)1 80.944462 42.071311 1.9239824 0.0543568
REGIONLeading:poly(temperature, 2)2 25.459363 42.010524 0.6060235 0.5444992
Anova
Chisq Df Pr(>Chisq)
REGION 0.0492786 1 0.8243233
poly(temperature, 2) 1862.8517066 2 0.0000000
TISSUE_MASS_CENTERED 1.2125982 1 0.2708182
REGION:poly(temperature, 2) 4.0807781 2 0.1299781
confint
2.5 % 97.5 % Estimate
(Intercept) 111.430863 136.9185516 124.174707
REGIONLeading -21.242847 16.8965076 -2.173170
poly(temperature, 2)1 806.022411 916.7561973 861.389304
poly(temperature, 2)2 28.580092 139.6090565 84.094574
TISSUE_MASS_CENTERED -3.115969 0.8741642 -1.120902
REGIONLeading:poly(temperature, 2)1 -1.513791 163.4027156 80.944462
REGIONLeading:poly(temperature, 2)2 -56.879751 107.7984777 25.459363
Std.Dev.(Intercept)|fish_id 20.115302 34.9069176 26.498362
r-squared
R2_conditional R2_marginal optional
0.9355146 0.8318206 FALSE

Pairwise comparisons

emtrends [latitudes]
ldh.model.1.p2a  %>% emtrends(var = "temperature", type = "response") %>% pairs(by = "temperature") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)

SCROLL TO THE RIGHT –>

The numbers in the leftmost column of the table indicate that the slopes assume mean TISSUE_MASS_CENTERED values when differences between latitudinal slopes are compared.

emmeans [latitudes]
ldh.model.1.p2a  %>% emmeans(pairwise ~ temperature*REGION, type = "response") %>% pairs(by = "temperature") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)
temperature
ldh.model.1.p2a  %>% emmeans(~ temperature*REGION, type = "response")  %>% summary(infer=TRUE)
Means - f(temperature)
ldh.model.1.p2a  %>% update(.~1+ REGION * as.factor(temperature) + TISSUE_MASS_CENTERED + (1|fish_id)) %>% 
  emmeans(~REGION*temperature, type = "response") %>% summary(infer=TRUE)
Abs. diff - f(temperature)
ldh.model.1.p2a  %>% update(.~1+ REGION * as.factor(temperature) + TISSUE_MASS_CENTERED + (1|fish_id)) %>% 
  emmeans(~REGION*temperature, type = "response") %>% pairs(by ="REGION") %>% summary(infer=TRUE)
Effect size
ldh.emm <- ldh.model.1.p2a %>% emmeans(~REGION*temperature)
eff_size(ldh.emm, sigma = sigma(ldh.model.1.p2a), edf=df.residual(ldh.model.1.p2a))
##  contrast                                                              
##  Core temperature35.1020408163265 - Leading temperature35.1020408163265
##  effect.size    SE  df lower.CL upper.CL
##        0.229 0.509 145   -0.778     1.24
## 
## sigma used for effect sizes: 20.9 
## Confidence level used: 0.95

Summary figure

## Warning: Removed 2 rows containing missing values (`geom_point()`).

Conclusion

  • In conclusion, LDH enzyme activity is significantly positively correlated with temperature; however, there is no significant difference in the relationship between temperature and LDH activity when comparing fish from low- and high-latitudes.

Citrate synthase

Scenario

For initial details on the experiment performed please read the ReadMe file. In brief, Acanthochromis polyacanthus from two different regions on the Great Barrier Reef (GBR) were tested for metabolic performance at four different temperatures: 27\(^\circ\)C, 28.5\(^\circ\)C, 30\(^\circ\)C, and 31.5\(^\circ\)C. Fish used in this study were collected from two different regions, low- (i.e., Cairns) and high-latitude (i.e., Mackay); within each region fish were collected from a total of three different populations. After metabolic performance was tested, blood and tissue samples were collected. White muscle tissue samples were used to look at the relationship between activity and temperature in two different enzymes, Lactate Dehydrogenase (LDH; anaerobic) and Citrate Synthase (CS; aerobic). Enzyme activity was measured at four different temperatures: 20\(^\circ\)C, 30\(^\circ\)C, 40\(^\circ\)C, and 50\(^\circ\)C. Enzyme activity was measured using a spectrophotometer, and wavelength absorption levels were recorded using the software program LabX.

Before beginning always make sure that you are working in the correct directory

knitr::opts_knit$set(root.dir=working.dir)

Now we can import the data. Two different data frames are being imported. The first has all the enzyme wavelength absorption data for each sample, and the tissue.mass data file contains information pertaining to the tissue sample that was used for each fish. Later on these two data frames will be merged.

Load data

cs <- read_delim("C:/Users/jc527762/OneDrive - James Cook University/PhD dissertation/Data/Chapter1_LocalAdaptation/enzymes/CS_LocalAdapt6.txt", 
                             delim = "\t", escape_double = FALSE, 
                             col_types = cols(`...21` = col_skip(), 
                                              `...22` = col_skip()), trim_ws = TRUE) %>% 
  clean_names() %>% 
  mutate(creation_time = as.POSIXct(creation_time, format = "%d/%m/%Y %H:%M:%S"))
tissue.mass <- read.delim("C:/Users/jc527762/OneDrive - James Cook University/PhD dissertation/Data/Chapter1_LocalAdaptation/enzymes/tissue_mass.txt") %>% 
  dplyr::rename(FISH_ID = fish_id)

Data manipulation

Before the data can be analysed it is important to clean up the data file. I won’t explain each step; the details can be worked out by examining the different functions. The main steps occurring below are columns being broken up to make new columns (this is the separate function), and columns being combined to make a unique UNIQUE_SAMPLE_ID value. Time data can also be tricky to deal with in R, so a number of data manipulation steps are used to make sure that time is being read properly.

#--- data preparation/manipulation ---# 
cs2 <- cs %>%
  mutate(muscle_type = str_replace(muscle_type, " ", ".")) %>%
  unite("UNIQUE_SAMPLE_ID", c(fish_id,temperature,sample_index), sep="_", remove = FALSE) %>% 
  separate(creation_time, into=c('DATE','TIME'), sep = " ", remove = FALSE) %>% 
  arrange(sample_id_1, DATE, TIME) 

cs3 <- cs2 %>% 
  mutate(DATE = as.Date(creation_time), 
         TIME = format(creation_time, "%H:%M:%S")) %>%
  mutate(TIME = hms(cs2$TIME)) %>% 
  mutate(TIME = chron(times=cs2$TIME)) %>% 
  arrange(TIME) %>%
  group_by(UNIQUE_SAMPLE_ID, sample_id_1) %>% 
  mutate(TIME_DIFF = TIME - first(TIME)) %>% 
  filter(TIME != first(TIME)) %>%
  ungroup() %>% 
  mutate(TIME_DIFF_SECS = period_to_seconds(hms(TIME_DIFF))) %>% 
  mutate(MINUTES = TIME_DIFF_SECS/60) %>% 
  mutate(MINUTES = round(MINUTES, digits = 2)) %>% 
  dplyr::rename(CUVETTE = sample_id_1) %>% 
  mutate(REGION = substr(fish_id, 1, 1 ), 
         POPULATION = substr(fish_id, 2, 4), 
         SAMPLE_NO = substr(fish_id, 5, 7)) %>% 
  mutate(REGION = case_when( REGION =="L"~ "Leading", 
                             REGION == "C" ~ "Core", 
                             TRUE ~ "na"))   

Data cleaning

Next, select data points will be removed. Data points were checked using a previously written app script that allows the user to look at the plotted points of the samples that were run. Because we are interested in the slope, points that plateaued would be removed from the analysis, as a plateau signifies that the reaction ran out of ‘fuel’ (for LDH, NADH; for CS, oxaloacetic acid). No reactions reached a plateau for CS; however, there were a number of runs and/or cuvettes where data quality was poor. These are removed below.

cs3.filtered <- cs3 %>% 
  dplyr::rename(TEMPERATURE = temperature, 
         FISH_ID = fish_id) %>%
  filter(!(TEMPERATURE == "50" & FISH_ID == "LCKM158")) %>% 
  filter(!(TEMPERATURE == "50" & FISH_ID == "CSUD010")) %>% 
  filter(!(TEMPERATURE == "40" & FISH_ID == "CSUD018")) %>% 
  filter(!(TEMPERATURE == "20" & FISH_ID == "CTON061")) %>% 
  filter(!(TEMPERATURE == "50" & FISH_ID == "CTON065")) %>%
  filter(!(FISH_ID == "CTON069")) %>% 
  filter(!(UNIQUE_SAMPLE_ID %in% c("CTON060_50_3", 
                                   "CTON061_30_3", 
                                   "CTON061_40_3", 
                                   "CTON061_50_2", 
                                   "CTON061_50_3")))%>% 
  mutate(UNIQUE_SAMPLE_ID = str_sub(UNIQUE_SAMPLE_ID, end = -3))

Great! Now we have all the data points that we want to keep. However, the data needs to be manipulated so that we can pull out slopes from the absorbance readings and then calculate enzyme activity based on these slopes. This will involve a number of steps.

Data calculations

Step1: Extract slopes

Step1 will produce a data frame that provides you with the slope obtained for cuvettes 1-3 for each sample run at each experimental temperature.

CS_activity <- cs3.filtered %>% 
  dplyr::group_by(UNIQUE_SAMPLE_ID, CUVETTE) %>% 
  do({
    mod = lm(result ~ MINUTES, data = .)
    data.frame(Intercept = coef(mod)[1],
               Slope = coef(mod)[2], 
               r2 = summary(mod)$adj.r.squared)
  }) %>%
  dplyr::ungroup() %>%
  filter(CUVETTE != ("6"))%>% 
  filter(CUVETTE != ("4"))%>% 
  filter(CUVETTE != ("5"))
Step2: Slope means

Step2 will calculate the mean slope for cuvette 1-3.

CS_activity_means <- CS_activity %>%
  dplyr::group_by(UNIQUE_SAMPLE_ID) %>% 
  dplyr::mutate(Mean = mean(Slope))
Step3: Background activity level

Step3 will calculate the background activity level by measuring the slope from cuvette 5 (positive control).

CS_background <- cs3.filtered %>% 
  group_by(UNIQUE_SAMPLE_ID, CUVETTE) %>% 
  do({
    mod = lm(result ~ MINUTES, data = .)
    data.frame(Intercept = coef(mod)[1],
               Slope = coef(mod)[2], 
               r2 = summary(mod)$adj.r.squared)
  }) %>%
  ungroup() %>%
  filter(CUVETTE == ("5")) %>% 
  dplyr::rename(Background = Slope)
Step4: Merging dataframes

Step4 will merge the data frames that you created with the mean slopes and the background slopes.

final_table <- CS_activity %>% 
  full_join(distinct(CS_activity_means[,c(1,6)]), by = "UNIQUE_SAMPLE_ID") %>% 
  full_join(CS_background[,c(1,4)], by = "UNIQUE_SAMPLE_ID") 
final_table$Mean[duplicated(final_table$Mean)] <- ""
final_table$Background[duplicated(final_table$Background)] <- ""
final_table <- final_table %>% 
  mutate(Mean = as.numeric(Mean), 
         Background = as.numeric(Background), 
         Background_perc = Background/Mean) 
Step5: Enzyme activity levels

Step5 is where enzyme activity levels are calculated. See further details in the manuscript (doi: xxx). Within this step background activity is taken into account and is subtracted from the sample slope whenever background activity was more than 5% of the sample slope.

CS.data <- final_table %>% 
  select(c(UNIQUE_SAMPLE_ID, Mean, Background, Background_perc)) %>% 
  mutate(Mean = as.numeric(Mean), 
         Background = as.numeric(Background), 
         Background_perc = as.numeric(Background_perc)) %>% 
  mutate(Background2 = case_when(Background_perc <= 0.05 ~ 0, 
                                 TRUE ~ Background), 
         CS_ABSORBANCE = Mean - Background2) %>%
  inner_join(select(cs3.filtered, c(UNIQUE_SAMPLE_ID, REGION, POPULATION, TEMPERATURE, FISH_ID)), by ="UNIQUE_SAMPLE_ID") %>% 
  inner_join(tissue.mass, by = "FISH_ID") %>% 
  mutate(TISSUE_MASS_CENTERED = scale(TISSUE_MASS, center = TRUE, scale = FALSE)) %>%
  distinct(UNIQUE_SAMPLE_ID, REGION, POPULATION, .keep_all = TRUE) %>% 
  mutate(REGION = factor(REGION),
         PATH_LENGTH = 1, 
         EXTINCTION_COEFFICIENT = 13.6, 
         TISSUE_CONCENTRATION = 0.01, 
         ASSAY_VOL = 0.930, 
         SAMPLE_VOL = 0.020, 
         CS_ACTIVITY = ((CS_ABSORBANCE/(PATH_LENGTH*EXTINCTION_COEFFICIENT*TISSUE_CONCENTRATION))*(ASSAY_VOL/SAMPLE_VOL))) %>% 
  filter(FISH_ID != "CVLA047")

By the end of this stage you should have a data frame that includes a column called CS_ACTIVITY along with the necessary metadata - this data frame will be used to perform the statistical analysis.

Exploratory data analysis

CS v TEMPERATURE [LATITUDE]
ggplot(CS.data, aes(x =as.numeric(TEMPERATURE), y= CS_ACTIVITY, color = REGION)) + 
  geom_point() + geom_smooth(method = "lm", se=FALSE)

CS V TEMPERATURE [DENSITY]
ggplot(CS.data, aes(x = CS_ACTIVITY, fill = TEMPERATURE, color = TEMPERATURE)) + 
  geom_density(alpha =0.5, position = "identity") 

CS v TISSUE MASS (LATITUDE)
ggplot(CS.data, aes(x =TISSUE_MASS_CENTERED, y= CS_ACTIVITY, color = REGION)) + 
  geom_point() + geom_smooth(method = "lm", se=FALSE)

Fit the model

The model was fit using the glm function and later the glmmTMB package in R. A number of different models were tested to determine which hypothesis and associated variables best predicted CS activity. Model fit was examined using AICc, BIC, and r-squared values. Additionally, models were examined via the validation diagnostics provided by the performance and DHARMa packages in R.

Fixed factors (linear regression models)
model 1
#--- base model ---#
cs.model.1 <- glm(CS_ACTIVITY ~ 1 + REGION*TEMPERATURE + TISSUE_MASS_CENTERED, 
                       family=gaussian(), 
                       data = CS.data)  

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.3917427 0.4681525 -2.9728401 0.0035409
REGIONLeading -0.6551419 0.6711737 -0.9761138 0.3308934
TEMPERATURE 0.1491176 0.0128965 11.5626489 0.0000000
TISSUE_MASS_CENTERED 0.0065822 0.0217986 0.3019550 0.7631882
REGIONLeading:TEMPERATURE 0.0372828 0.0183990 2.0263512 0.0448568
model 2
cs.model.2 <- glm(CS_ACTIVITY ~ 1 + REGION*TEMPERATURE, 
                       family=gaussian(), 
                       data = CS.data) 

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.3792002 0.4646214 -2.968439 0.0035836
REGIONLeading -0.6734697 0.6660086 -1.011203 0.3138573
TEMPERATURE 0.1489822 0.0128421 11.601054 0.0000000
REGIONLeading:TEMPERATURE 0.0374138 0.0183274 2.041412 0.0432978
model comparison table
model df AICc BIC r2
cs.model.1 6 414.3969 430.9192 0.7286043
cs.model.2 5 412.2926 426.1464 0.7299815

Based on AICc and BIC, the model without TISSUE_MASS_CENTERED actually performs marginally better; however, the difference is small, and we will move ahead with the model that contains TISSUE_MASS_CENTERED as a co-variate.

Polynomials

polynomial models

Note that the linear model has already been created via model cs.model.1 in the previous section.

##--- second order polynomial ---# 
cs.model.1.p2 <- glm(CS_ACTIVITY ~ 1 + REGION*poly(TEMPERATURE, 2) + TISSUE_MASS_CENTERED, 
                      family=gaussian(), 
                      data = CS.data)  

#--- third order polynomial ---# 
cs.model.1.p3 <- glm(CS_ACTIVITY ~ 1 + REGION*poly(TEMPERATURE, 3) + TISSUE_MASS_CENTERED, 
                      family=gaussian(), 
                      data = CS.data)  
polynomial model comparisons
model df AICc BIC r2
cs.model.1 6 414.3969 430.9192 0.7347879
cs.model.1.p2 8 417.7056 439.4558 0.7372215
cs.model.1.p3 10 421.9615 448.7881 0.7380345

From our model comparison we can see that the model that treats temperature as a linear term performs the best (lowest AICc and BIC). Therefore, moving forward we will use the linear model.
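For reference, R's poly() builds orthogonal polynomial columns rather than raw powers, which keeps the linear and quadratic terms uncorrelated. A rough Python/NumPy analogue (signs and scaling may differ from R's exact convention):

```python
import numpy as np

def ortho_poly(x, degree):
    """Orthogonal polynomial basis, similar in spirit to R's poly(x, degree):
    QR-decompose the centred Vandermonde matrix so the columns are
    mutually orthonormal."""
    x = np.asarray(x, dtype=float)
    centred = x - x.mean()
    X = np.vander(centred, degree + 1, increasing=True)[:, 1:]  # drop intercept
    Q, _ = np.linalg.qr(X)
    return Q

# the four treatment temperatures, each appearing several times
temps = np.array([27.0, 28.5, 30.0, 31.5] * 3)
P = ortho_poly(temps, 2)
```

Because the columns are orthogonal, adding the quadratic term does not destabilise the linear coefficient, which is why poly(TEMPERATURE, 2) is generally preferred over TEMPERATURE + I(TEMPERATURE^2).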

Random factors

Fish were repeatedly sampled over four different temperatures; therefore, this repeated sampling needs to be accounted for by including random factors within the model. There are a number of options for the random-factor structure, including 1) accounting for repeated sampling of individuals, 2) accounting for repeated sampling of individuals nested within population, and 3) accounting for repeated sampling of individuals and populations without nesting. All three models will be run and compared.

random factor models
cs.model.1a <- glmmTMB(CS_ACTIVITY ~ 1 + REGION*TEMPERATURE + TISSUE_MASS_CENTERED + (1|FISH_ID), 
                       family=gaussian(), 
                       data = CS.data, 
                       REML = TRUE) 

cs.model.1b <- glmmTMB(CS_ACTIVITY ~ 1 + REGION*TEMPERATURE + TISSUE_MASS_CENTERED + (1|POPULATION/FISH_ID), 
                       family=gaussian(), 
                       data = CS.data,
                       REML = TRUE) 

cs.model.1c <- glmmTMB(CS_ACTIVITY ~ 1 + REGION*TEMPERATURE + TISSUE_MASS_CENTERED + (1|FISH_ID) + (1 + REGION|POPULATION), 
                       family=gaussian(), 
                       data = CS.data,
                       REML = TRUE)
random factor model comparisons
model df AICc BIC r2m r2c
cs.model.1a 7 368.4369 387.5916 0.7237186 0.7237186
cs.model.1b 8 369.5447 391.2949 0.7114656 0.7114656
cs.model.1c 10 373.7411 400.5677 0.7110716 0.7110716

Model cs.model.1a appears to be the best model; however, there is little difference between the models regardless of how the random factors are arranged.

Model validation

performance
cs.model.1a (linear)

DHARMa residuals
cs.model.1a (linear)
cs.model.1a %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.724 0.74 0.66 0.72 0.156 0.196 0.08 0.068 0.36 0.464 0.792 0.784 0.588 0.416 0.04 0.484 0.828 0.64 0.2 0.168 ...
cs.model.1a %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.065231, p-value = 0.6377
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.97307, p-value = 0.984
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 1, observations = 130, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.0001947334 0.0421127390
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                            0.007692308

The cs.model.1a model looks okay… let's try a link transformation to see if we can get any improvement.

Model re-validation

performance
Gaussian (identity)

Gaussian (log)

Gaussian (inverse)

DHARMa
Gaussian (identity)
cs.model.1a %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.724 0.74 0.66 0.72 0.156 0.196 0.08 0.068 0.36 0.464 0.792 0.784 0.588 0.416 0.04 0.484 0.828 0.64 0.2 0.168 ...
cs.model.1a %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.065231, p-value = 0.6377
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.97307, p-value = 0.984
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 1, observations = 130, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.0001947334 0.0421127390
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                            0.007692308
Gaussian (log)
cs.model.1a.log %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.676 0.864 0.752 0.664 0.036 0.144 0.112 0.088 0.1 0.524 0.836 0.748 0.64 0.568 0.06 0.256 0.896 0.58 0.012 0.14 ...
cs.model.1a.log %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.091692, p-value = 0.2244
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.98328, p-value = 0.944
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 0, observations = 130, p-value = 0.6309
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.00000000 0.02797718
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                                      0
Gaussian (inverse)
cs.model.1a.inv %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.512 0.856 0.864 0.652 0.032 0.104 0.136 0.08 0.064 0.396 0.848 0.516 0.6 0.564 0.048 0.148 0.888 0.524 0.008 0.144 ...
cs.model.1a.inv %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.11877, p-value = 0.05107
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.58039, p-value = 0.408
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 0, observations = 130, p-value = 0.6309
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.00000000 0.02797718
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                                      0

From looking at the different models it looks like the model with the log-link function performs the best. In the DHARMa validation test we can see that one of the quantile deviations is violated. Because the model passes all the other data validations relatively well we could move on with the log-link model. However, previously we showed that the 2nd and 3rd order polynomials also performed quite well, and we know the LDH model was not linear. So before we choose our final model, let's see what the 2nd and 3rd order polynomials look like with a log-link.
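Before comparing link functions it is worth noting what a log link actually changes: a Gaussian model with a log link models log(E[y]), which is not the same as log-transforming the response, i.e. modelling E[log(y)]. A small Python illustration with made-up values:

```python
import math

# hypothetical right-skewed response values
y = [1.0, 2.0, 4.0, 8.0]

log_of_mean = math.log(sum(y) / len(y))             # what a log LINK targets
mean_of_log = sum(math.log(v) for v in y) / len(y)  # what log-TRANSFORMING targets

# By Jensen's inequality the log of the mean exceeds the mean of the logs,
# so the two modelling choices estimate different quantities.
```

With a log link, back-transformed predictions are estimates of the mean on the original response scale, which is usually what we want for enzyme activities.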

Model re-re-validation

performance
Gaussian (quadratic-log)

DHARMa
Gaussian (quadratic-log)
cs.model.1a.log.p2 %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.84 0.864 0.696 0.724 0.04 0.096 0.064 0.1 0.164 0.484 0.8 0.9 0.624 0.444 0.072 0.36 0.896 0.636 0.036 0.072 ...
cs.model.1a.log.p2 %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.070769, p-value = 0.533
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 1.087, p-value = 0.616
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 1, observations = 130, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.0001947334 0.0421127390
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                            0.007692308

Validations look great! Moving ahead with the quadratic log-link model.

Partial plots

ggemmeans

plot_model

Model investigation

summary
Estimate StdError Zvalue Pvalue
(Intercept) 1.1984073 0.0564498 21.2295969 0.0000000
REGIONLeading 0.1356647 0.0820867 1.6526997 0.0983920
poly(TEMPERATURE, 2)1 5.3642867 0.2711131 19.7861598 0.0000000
poly(TEMPERATURE, 2)2 -0.8451058 0.2110951 -4.0034360 0.0000624
TISSUE_MASS_CENTERED 0.0024962 0.0082421 0.3028589 0.7619974
REGIONLeading:poly(TEMPERATURE, 2)1 0.4346926 0.3715558 1.1699254 0.2420310
REGIONLeading:poly(TEMPERATURE, 2)2 0.1150385 0.2844006 0.4044947 0.6858490
Anova
Chisq Df Pr(>Chisq)
REGION 4.3163842 1 0.0377470
poly(TEMPERATURE, 2) 1234.7927952 2 0.0000000
TISSUE_MASS_CENTERED 0.0917235 1 0.7619974
REGION:poly(TEMPERATURE, 2) 3.5245189 2 0.1716566
confint
2.5 % 97.5 % Estimate
(Intercept) 1.0877676 1.3090469 1.1984073
REGIONLeading -0.0252223 0.2965517 0.1356647
poly(TEMPERATURE, 2)1 4.8329148 5.8956585 5.3642867
poly(TEMPERATURE, 2)2 -1.2588446 -0.4313670 -0.8451058
TISSUE_MASS_CENTERED -0.0136580 0.0186503 0.0024962
REGIONLeading:poly(TEMPERATURE, 2)1 -0.2935434 1.1629285 0.4346926
REGIONLeading:poly(TEMPERATURE, 2)2 -0.4423764 0.6724535 0.1150385
Std.Dev.(Intercept)|FISH_ID 0.1678718 0.2807799 0.2171060
r-squared
R2_conditional R2_marginal optional
0.551021 0.4639599 FALSE
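The marginal and conditional R² values reported above follow the Nakagawa & Schielzeth variance decomposition; the two differ only in whether the random-effect variance counts as explained. A sketch in Python with hypothetical variance components:

```python
def nakagawa_r2(var_fixed, var_random, var_resid):
    """Marginal R2 (fixed effects only) and conditional R2 (fixed + random),
    following the Nakagawa & Schielzeth decomposition for Gaussian models."""
    total = var_fixed + var_random + var_resid
    return var_fixed / total, (var_fixed + var_random) / total

# hypothetical variance components, for illustration only
r2m, r2c = nakagawa_r2(var_fixed=1.0, var_random=0.2, var_resid=0.8)
```

The gap between R2_conditional and R2_marginal in the output above therefore reflects variance attributed to the FISH_ID random intercept.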

Pairwise comparisons

emtrends [latitudes]
cs.model.1a.log.p2  %>% emtrends(var = "TEMPERATURE", type = "response") %>% pairs(by = "TEMPERATURE") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)

SCROLL TO THE RIGHT –>

The numbers in the leftmost column of the table indicate that the slopes assume mean TISSUE_MASS_CENTERED values when comparing slopes between latitudes.

emmeans [latitudes]
cs.model.1a.log.p2  %>% emmeans(pairwise ~ TEMPERATURE*REGION, type = "response") %>% pairs(by = "TEMPERATURE") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)
TEMPERATURE
cs.model.1a.log.p2  %>% emmeans(~ TEMPERATURE*REGION, type = "response")  %>% summary(infer=TRUE)
Means - f(TEMPERATURE)
cs.model.1a.log.p2  %>% update(.~1+ REGION * as.factor(TEMPERATURE) + TISSUE_MASS_CENTERED + (1|FISH_ID)) %>% 
  emmeans(~REGION*TEMPERATURE, type = "response") %>% summary(infer=TRUE)
Abs. diff - f(TEMPERATURE)
cs.model.1a.log.p2  %>% update(.~1+ REGION * as.factor(TEMPERATURE) + TISSUE_MASS_CENTERED + (1|FISH_ID)) %>% 
  emmeans(~REGION*TEMPERATURE, type = "response") %>% pairs(by ="REGION") %>% summary(infer=TRUE)
Effect size
cs.emm <- cs.model.1a.log.p2 %>% emmeans(~REGION*TEMPERATURE)
eff_size(cs.emm, sigma = sigma(cs.model.1a.log.p2), edf=df.residual(cs.model.1a.log.p2))
##  contrast                                                              
##  Core TEMPERATURE34.6153846153846 - Leading TEMPERATURE34.6153846153846
##  effect.size   SE  df lower.CL upper.CL
##        -0.25 0.17 121   -0.586    0.086
## 
## sigma used for effect sizes: 0.493 
## Confidence level used: 0.95
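The eff_size() output above standardises the estimated mean difference by the residual sigma, giving a Cohen's-d-style quantity. A minimal Python sketch (the raw difference here is a hypothetical value, chosen only to illustrate the scale of a d of about -0.25 alongside the sigma reported above):

```python
def cohens_d(mean_diff, sigma):
    """Standardised effect size: a mean difference expressed in
    residual-standard-deviation units."""
    return mean_diff / sigma

# hypothetical raw difference, paired with sigma = 0.493 from the output above
d = cohens_d(-0.123, 0.493)
```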

Summary figure

Conclusion

  • In conclusion, CS enzyme activity is significantly positively correlated with temperature; however, there is no significant difference in the relationship between temperature and CS activity when comparing fish from low- and high-latitude regions.

Lactate dehydrogenase: citrate synthase

Before beginning always make sure that you are working in the correct directory

knitr::opts_knit$set(root.dir=working.dir)

Now we can import the data. Two different data frames are being imported. The first has all the enzyme wavelength absorption data for each sample, and the tissue.mass data file contains information pertaining to the tissue sample used for each assay. Later on these two data frames will be merged.

Load data

Data manipulation

Both the LDH and CS data frames have been previously cleaned and manipulated; therefore, the only remaining steps are to join the two data frames together and then create another column containing the LDH:CS ratio.

#--- data preparation/manipulation ---# 
ldh.cs.data <- ldh.data %>% 
  inner_join(select(cs.data, c("UNIQUE_SAMPLE_ID","CS_ACTIVITY")), by = "UNIQUE_SAMPLE_ID") %>% 
  mutate(LCr = LDH_ACTIVITY/CS_ACTIVITY)
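The join-and-ratio step above can be mirrored almost line for line in other data tools; here is a hedged Python/pandas sketch with toy values (column names borrowed from the R code, data invented):

```python
import pandas as pd

# toy stand-ins for ldh.data and cs.data (values invented)
ldh = pd.DataFrame({"UNIQUE_SAMPLE_ID": ["A1", "A2"], "LDH_ACTIVITY": [40.0, 60.0]})
cs = pd.DataFrame({"UNIQUE_SAMPLE_ID": ["A1", "A2"], "CS_ACTIVITY": [2.0, 3.0]})

# inner join on the shared key, then compute the LDH:CS ratio column
ldh_cs = ldh.merge(cs[["UNIQUE_SAMPLE_ID", "CS_ACTIVITY"]],
                   on="UNIQUE_SAMPLE_ID", how="inner")
ldh_cs["LCr"] = ldh_cs["LDH_ACTIVITY"] / ldh_cs["CS_ACTIVITY"]
```

Like dplyr's inner_join(), merge(..., how="inner") keeps only samples present in both data frames, so fish missing either assay drop out of the ratio analysis.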

Exploratory data analysis

LDH-CS v TEMPERATURE [LATITUDE]
ggplot(ldh.cs.data, aes(x =as.numeric(temperature), y= LCr, color = REGION)) + 
  geom_point() + geom_smooth(method = "lm", se=FALSE)

LDH-CS V TEMPERATURE [DENSITY]
ggplot(ldh.cs.data, aes(x = LCr)) + 
  geom_density(alpha =0.5, position = "identity") 

LDH-CS v TISSUE MASS (LATITUDE)
ggplot(ldh.cs.data, aes(x =TISSUE_MASS_CENTERED, y= LCr, color = REGION)) + 
  geom_point() + geom_smooth(method = "lm", se=FALSE)

Fit the model

The model was fit using the glm function and later the glmmTMB package in R. A number of different models were tested to determine which hypothesis and associated variables best predicted the LDH:CS ratio. Model fit was examined using AICc, BIC, and r-squared values. Additional models were examined via the validation diagnostics provided by the performance and DHARMa packages in R.

Fixed factors (linear regression models)
model 1
#--- base model ---#
ldh.cs.model.1 <- glm(LCr~ 1 + REGION*temperature + TISSUE_MASS_CENTERED, 
                       family=gaussian(), 
                       data = ldh.cs.data)  

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) 16.5559544 4.6644620 3.5493813 0.0005463
REGIONLeading -11.5092807 6.7532864 -1.7042489 0.0908391
temperature 0.4237283 0.1285284 3.2967687 0.0012761
TISSUE_MASS_CENTERED -0.5178041 0.2176780 -2.3787622 0.0188971
REGIONLeading:temperature 0.1482384 0.1847021 0.8025813 0.4237523
model 2
ldh.cs.model.2 <- glm(LCr ~ 1 + REGION*temperature, 
                       family=gaussian(), 
                       data = ldh.cs.data) 

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) 15.6040768 4.7330711 3.2968186 0.0012733
REGIONLeading -10.2121229 6.8555171 -1.4896211 0.1388428
temperature 0.4343827 0.1308220 3.3204114 0.0011784
REGIONLeading:temperature 0.1412875 0.1880888 0.7511746 0.4539595
model comparison table
model df AICc BIC r2
ldh.cs.model.1 6 1004.464 1020.934 0.2473697
ldh.cs.model.2 5 1008.019 1021.831 0.2152221

The model that contains TISSUE_MASS_CENTERED does better (lower AICc and BIC) than the model that leaves TISSUE_MASS_CENTERED out. Therefore we will move ahead with the model that contains TISSUE_MASS_CENTERED as a covariate.

Polynomials
polynomial models

Note that the linear model has already been created via model ldh.cs.model.1 in the previous section.

#--- second order polynomial ---# 
ldh.cs.model.1.p2 <- glm(LCr ~ 1 + REGION*poly(temperature, 2) + TISSUE_MASS_CENTERED, 
                      family=gaussian(), 
                      data = ldh.cs.data)  

#--- third order polynomial ---# 
ldh.cs.model.1.p3 <- glm(LCr ~ 1 + REGION*poly(temperature, 3) + TISSUE_MASS_CENTERED, 
                      family=gaussian(), 
                      data = ldh.cs.data)  

polynomial model comparisons

model df AICc BIC r2
ldh.cs.model.1 6 1004.464 1020.934 0.2533279
ldh.cs.model.1.p2 8 1007.305 1028.984 0.2629313
ldh.cs.model.1.p3 10 1010.107 1036.841 0.2734990

From our model comparison we can see that the model that treats temperature as a linear term performs the best (lowest AICc and BIC). Therefore, moving forward we will use the linear model.

Random factors

Fish were repeatedly sampled over four different temperatures; therefore, this repeated sampling needs to be accounted for by including random factors within the model. There are a number of options for the random-factor structure, including 1) accounting for repeated sampling of individuals, 2) accounting for repeated sampling of individuals nested within population, and 3) accounting for repeated sampling of individuals and populations without nesting. All three models will be run and compared.

random factor models
ldh.cs.model.1a <- glmmTMB(LCr ~ 1 + REGION*temperature + TISSUE_MASS_CENTERED + (1|fish_id), 
                       family=gaussian(), 
                       data = ldh.cs.data, 
                       REML = TRUE) 

ldh.cs.model.1b <- glmmTMB(LCr ~ 1 + REGION*temperature + TISSUE_MASS_CENTERED + (1|POPULATION/fish_id), 
                       family=gaussian(), 
                       data = ldh.cs.data,
                       REML = TRUE) 

ldh.cs.model.1c <- glmmTMB(LCr ~ 1 + REGION*temperature + TISSUE_MASS_CENTERED + (1|fish_id) + (1 + REGION|POPULATION), 
                       family=gaussian(), 
                       data = ldh.cs.data,
                       REML = TRUE) # convergence problem

random factor model comparisons

model df AICc BIC r2m r2c
ldh.cs.model.1a 7 951.3505 970.4436 0.2394559 0.2394559
ldh.cs.model.1b 8 953.6249 975.3034 0.2394565 0.2394565
ldh.cs.model.1c 10 NA NA 0.2394547 0.2394547

Model ldh.cs.model.1a appears to be the best model; however, there is little difference between the models regardless of how the random factors are arranged.

Model validation

performance
ldh.cs.model.1a (linear)

DHARMa residuals
ldh.cs.model.1a (linear)
ldh.cs.model.1a %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.06 0.016 0.024 0.072 0.952 0.444 0.908 0.896 0.768 0.708 0.384 0.612 0.768 0.928 0.972 0.252 0.12 0.384 1 0.844 ...
ldh.cs.model.1a %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.067628, p-value = 0.5968
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.91578, p-value = 0.624
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 1, observations = 129, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.0001962428 0.0424334251
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                            0.007751938

The ldh.cs.model.1a model looks good, and there seem to be no major violations of assumptions.

Partial plots

ggemmeans

plot_model

Model investigation

summary
Estimate StdError Zvalue Pvalue
(Intercept) 15.5847436 3.6972542 4.215221 0.0000250
REGIONLeading -10.9295216 5.3808871 -2.031175 0.0422373
temperature 0.4476407 0.0805748 5.555595 0.0000000
TISSUE_MASS_CENTERED -0.4760598 0.3745783 -1.270922 0.2037564
REGIONLeading:temperature 0.1368442 0.1154047 1.185777 0.2357104
Anova
Chisq Df Pr(>Chisq)
REGION 2.960875 1 0.0853018
temperature 79.469080 1 0.0000000
TISSUE_MASS_CENTERED 1.615243 1 0.2037564
REGION:temperature 1.406067 1 0.2357104
confint
2.5 % 97.5 % Estimate
(Intercept) 8.3382585 22.8312287 15.5847436
REGIONLeading -21.4758666 -0.3831766 -10.9295216
temperature 0.2897171 0.6055643 0.4476407
TISSUE_MASS_CENTERED -1.2102198 0.2581001 -0.4760598
REGIONLeading:temperature -0.0893448 0.3630331 0.1368442
Std.Dev.(Intercept)|fish_id 7.1479866 12.7455875 9.5449091
r-squared
R2_conditional R2_marginal optional
0.7281208 0.2394559 FALSE

Pairwise comparisons

emtrends [latitudes]
ldh.cs.model.1a  %>% emtrends(var = "temperature", type = "response") %>% pairs(by = "temperature") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)

SCROLL TO THE RIGHT –>

The numbers in the leftmost column of the table indicate that the slopes assume mean TISSUE_MASS_CENTERED values when comparing slopes between latitudes.

emmeans [latitudes]
ldh.cs.model.1a  %>% emmeans(pairwise ~ temperature*REGION, type = "response") %>% pairs(by = "temperature") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)
TEMPERATURE
ldh.cs.model.1a  %>% emmeans(~ temperature*REGION, type = "response")  %>% summary(infer=TRUE)
Means - f(TEMPERATURE)
ldh.cs.model.1a  %>% update(.~1+ REGION * as.factor(temperature) + TISSUE_MASS_CENTERED + (1|fish_id)) %>% 
  emmeans(~REGION*temperature, type = "response") %>% summary(infer=TRUE)
Abs. diff - f(TEMPERATURE)
ldh.cs.model.1a  %>% update(.~1+ REGION * as.factor(temperature) + TISSUE_MASS_CENTERED + (1|fish_id)) %>% 
  emmeans(~REGION*temperature, type = "response") %>% pairs(by ="REGION") %>% summary(infer=TRUE)
Effect size
ldh.cs.emm <- ldh.cs.model.1a %>% emmeans(~REGION*temperature)
eff_size(ldh.cs.emm, sigma = sigma(ldh.cs.model.1a), edf=df.residual(ldh.cs.model.1a))
##  contrast                                                              
##  Core temperature34.7286821705426 - Leading temperature34.7286821705426
##  effect.size    SE  df lower.CL upper.CL
##        0.868 0.507 127   -0.136     1.87
## 
## sigma used for effect sizes: 7.12 
## Confidence level used: 0.95

Summary figure

Conclusion

  • In conclusion, the LDH:CS ratio is significantly positively correlated with temperature; however, there is no significant difference in the relationship between temperature and LDH:CS when comparing fish from low- and high-latitude regions.

Immunocompetence

Scenario

For initial details on the experiment performed please read the ReadMe file. In brief, Acanthochromis polyacanthus from two different regions on the Great Barrier Reef (GBR) were tested for metabolic performance at four different temperatures: 27\(^\circ\)C, 28.5\(^\circ\)C, 30\(^\circ\)C, and 31.5\(^\circ\)C. Fish used in this study were collected from two different regions, low-latitude (i.e., Cairns) and high-latitude (i.e., Mackay); within each region, fish were collected from three different populations.

Immunocompetence was tested via phytohaemagglutinin (PHA) swelling assays at the same four experimental temperatures at which metabolic performance was tested. To perform the assay, fish were injected subcutaneously in the caudal peduncle with 0.03 mL of PHA. Thickness of the injection site was measured pre-injection as well as 18-24 hours post-injection. PHA produces a localized, cell-mediated response (e.g., inflammation, T-cell proliferation). The change in thickness between measurement periods was used as a proxy for immunocompetence.

Read in the data

Before beginning always make sure that you are working in the correct directory

knitr::opts_knit$set(root.dir=working.dir)

Now we can import the data. Replace import data with the PATH to your data file. I have secretly labelled my PATH import.data (i.e. import.data = “PATH TO MY FILE”)

Load data

pha <- import.data

Data manipulation

Before the data can be analysed it is important to clean up the data file. Below a number of adjustments are made, primarily making sure that columns are being treated appropriately as either factors, numeric, or as time, as well as the renaming of some columns.

pha2 <- pha %>% 
  dplyr::rename(PHA_28.5 = PHA_285) %>%
  mutate(FISH_ID = factor(FISH_ID), 
         POPULATION = factor(POPULATION), 
         REGION = factor(REGION), 
         TANK = factor(TANK), 
         PHA_28.5 = as.numeric(PHA_28.5), 
         MASS_CENTERED = scale(MASS, center = TRUE, scale = FALSE)) %>% 
  pivot_longer(cols = c(PHA_27, 
                        PHA_28.5, 
                        PHA_30, 
                        PHA_31.5), 
               names_to = 'PHA', 
               values_to = 'IMMUNE_RESPONSE') %>% 
  separate(col = PHA, 
           into = c('TEST','TEMPERATURE'), sep = '_') %>% 
  filter(IMMUNE_RESPONSE >= 0.01) %>% # keep only positive swelling responses (drops negative and near-zero values)
  mutate(TEMPERATURE = as.numeric(TEMPERATURE))
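The pivot_longer() + separate() steps above amount to a wide-to-long reshape plus a column-name split; a plain-Python sketch of that logic for a single hypothetical row:

```python
# One hypothetical wide-format row, mimicking the pha data before pivoting
row = {"FISH_ID": "F01", "PHA_27": 0.12, "PHA_28.5": 0.10,
       "PHA_30": 0.08, "PHA_31.5": 0.05}

long_rows = []
for col, value in row.items():
    if not col.startswith("PHA_"):
        continue                          # skip id columns
    test_name, temp = col.split("_", 1)   # separate() on the underscore
    long_rows.append({"FISH_ID": row["FISH_ID"],
                      "TEST": test_name,
                      "TEMPERATURE": float(temp),
                      "IMMUNE_RESPONSE": value})
```

Each fish thus contributes one row per temperature, which is the long format the models and plots below expect.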

Great! That is everything for data manipulation

Exploratory data analysis

PHA V TEMP

ggplot(pha2, aes(x=TEMPERATURE, y=IMMUNE_RESPONSE)) + 
  geom_violin(alpha = 0.5) +  # four potential outliers but will keep for now 
  geom_point() 

PHA v TEMP (LATITUDE)

ggplot(pha2, aes(x=TEMPERATURE, y=IMMUNE_RESPONSE, fill = REGION, color = REGION)) + 
  geom_violin(alpha = 0.5) + 
  geom_point(position = position_jitterdodge(dodge.width = 0.9, jitter.width = 0), color = "black")

PHA v MASS (LATITUDE)

ggplot(pha2, aes(x=MASS_CENTERED, y=IMMUNE_RESPONSE, fill = REGION, color = REGION)) +
  geom_point() + geom_smooth(method = "lm")

Fit the model

The model was fit using the glm function and later the glmmTMB package in R. A number of different models were tested to determine which hypothesis and associated variables best predicted immune response. Model fit was examined using AICc, BIC, and r-squared values. Additional models were examined via the validation diagnostics provided by the performance and DHARMa packages in R.

Fixed factors (linear regression models)

model 1
#--- base model ---#
pha.1 <- glm(IMMUNE_RESPONSE ~ 1 + REGION * TEMPERATURE + MASS_CENTERED, 
                     family=gaussian(), 
                     data = pha2) 
summary
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.7185981 0.4222908 4.0697030 0.0000750
REGIONLeading -0.6055925 0.6198923 -0.9769318 0.3301350
TEMPERATURE -0.0504956 0.0146466 -3.4476083 0.0007294
MASS_CENTERED 3.0066564 2.2817687 1.3176868 0.1895654
REGIONLeading:TEMPERATURE 0.0207950 0.0213983 0.9718052 0.3326714
model 2
#--- model without MASS_CENTERED ---#
pha.2 <- glm(IMMUNE_RESPONSE ~ 1 + REGION * TEMPERATURE, 
                     family=gaussian(), 
                     data = pha2)  
summary
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.6908547 0.4227662 3.9995034 0.0000980
REGIONLeading -0.6042622 0.6213620 -0.9724801 0.3323269
TEMPERATURE -0.0489398 0.0146335 -3.3443600 0.0010344
REGIONLeading:TEMPERATURE 0.0195688 0.0214288 0.9132038 0.3625536
model comparison table
model df AICc BIC r2
pha.1 6 -22.05659 -4.195801 0.1032647
pha.2 5 -22.43443 -7.482064 0.0939358
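The comparison tables in this document report small-sample corrected AICc values. As a hedged, self-contained sketch (using the built-in mtcars data rather than pha2, and a hand-rolled `aicc` helper rather than a package function), AICc can be computed from a fitted glm like so:

```r
# Hedged sketch: small-sample corrected AIC (AICc) for a fitted model.
# `aicc` is an illustrative helper, not a function from the analysis above.
aicc <- function(model) {
  k <- attr(logLik(model), "df")  # number of estimated parameters
  n <- nobs(model)                # number of observations
  AIC(model) + (2 * k * (k + 1)) / (n - k - 1)  # AIC + small-sample penalty
}

# toy demonstration on built-in data with two nested gaussian GLMs
m1 <- glm(mpg ~ wt + hp, family = gaussian(), data = mtcars)
m2 <- glm(mpg ~ wt,      family = gaussian(), data = mtcars)
data.frame(model = c("m1", "m2"),
           AICc  = c(aicc(m1), aicc(m2)),
           BIC   = c(BIC(m1), BIC(m2)))
```

Lower AICc and BIC values indicate better expected out-of-sample fit, with the AICc correction mattering most when the sample is small relative to the number of parameters.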

There is little difference between the two initial models; therefore, we will move forward with the model that has fewer terms (pha.2).

Next we will test whether the variable TEMPERATURE performs best as a 1st (linear), 2nd (quadratic), or 3rd (cubic) order polynomial, as the relationship between temperature and immune response is predicted to be non-linear.

Polynomials

polynomial models

Note that the linear model has already been created via model pha.2 in the previous section.

pha.2.p2 <- glm(IMMUNE_RESPONSE ~ 1 + REGION * poly(TEMPERATURE, 2), 
                 family=gaussian(),
                 data = pha2)  

pha.2.p3 <- glm(IMMUNE_RESPONSE ~ 1 + REGION * poly(TEMPERATURE, 3), 
                 family=gaussian(),
                 data = pha2)
polynomial model comparisons
model df AICc BIC r2
pha.2 5 -22.43443 -7.482064 0.0955802
pha.2.p2 7 -33.95206 -13.211452 0.1814782
pha.2.p3 9 -36.46535 -10.053268 0.2166317

From our model comparison we can see that the model improves when TEMPERATURE is modeled as a 2\(^nd\)- or 3\(^rd\)-order polynomial. The model that implements a 3\(^rd\)-order polynomial performs the best; therefore, we will be moving forward with this model.
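Note that poly() builds orthogonal polynomials by default, which keeps the 1st-, 2nd-, and 3rd-order terms uncorrelated with one another, whereas raw powers of temperature would be highly collinear. A small base-R sketch (using a toy temperature vector, not the pha2 data):

```r
# Hedged sketch: orthogonal vs raw polynomial bases for the four
# treatment temperatures (toy replication of 10 fish per temperature).
temps <- rep(c(27, 28.5, 30, 31.5), each = 10)

P <- poly(temps, 3)      # orthogonal basis, as used by poly(TEMPERATURE, 3)
round(crossprod(P), 10)  # ~ identity matrix: columns are orthonormal

R <- cbind(temps, temps^2, temps^3)  # raw powers, by contrast
cor(R)                               # raw terms are highly correlated
```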

Random factors

Fish were repeatedly sampled across the four temperatures, so this repeated sampling needs to be accounted for by including random factors in the model. There are several options for the random-factor structure, including: 1) accounting for repeated sampling of individuals, 2) accounting for repeated sampling of individuals nested within populations, and 3) accounting for repeated sampling of individuals and populations without nesting. All three models will be run and compared.

random factor models
pha.2.p3a <- glmmTMB(IMMUNE_RESPONSE ~ 1 + REGION * poly(TEMPERATURE, 3) + (1|FISH_ID), 
                  family=gaussian(),
                  data = pha2,
                  REML = TRUE) 


pha.2.p3b <- glmmTMB(IMMUNE_RESPONSE ~ 1 + REGION * poly(TEMPERATURE, 3) + (1|POPULATION/FISH_ID), 
                  family=gaussian(),
                  data = pha2,
                  REML = TRUE)

pha.2.p3c <- glmmTMB(IMMUNE_RESPONSE ~ 1 + REGION * poly(TEMPERATURE, 3) + (1|FISH_ID) + (1|POPULATION), 
                  family=gaussian(),
                  data = pha2,
                  REML = TRUE)
random factor model comparisons
model df AICc BIC r2m r2c
pha.2.p3a 10 -19.04376 10.15879 0.2090404 0.2090404
pha.2.p3b 11 -18.95044 13.01158 0.2022862 0.2517016
pha.2.p3c 11 -18.95044 13.01158 0.2022862 0.2517016

There is little difference between the models; however, the nested model does seem to perform a bit better than the non-nested model that only includes (1|FISH_ID). There is no difference between the second and third models, so either could be used. Moving forward, the second model with the nested random effects will be used.

Model validation

performance

pha.2.p3b

DHARMa residuals

pha.2.p3b
pha.2.p3b %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.336 0.116 0.828 0.4 0.232 0.192 0.504 0.036 0.284 0.42 0.42 0.52 0.476 0.648 0.536 0.192 0.172 0.52 0.44 0.74 ...
pha.2.p3b %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.10325, p-value = 0.06743
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.94119, p-value = 0.656
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 2, observations = 159, p-value = 0.3618
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.001526975 0.044697834
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                             0.01257862

The pha.2.p3b model performs well; however, in the model validation performed by the performance package our modeled prediction lines do not match our observed data as well as we might hope. There are also some issues with the residuals in our DHARMa validations. Let’s see if we can fix this by trying different link functions within our model.
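The log- and inverse-link models validated below (pha.2.p3b.log and pha.2.p3b.inv) are not defined in the text above; they were presumably fit along the following lines. This is a hedged sketch — the model names match the DHARMa calls that follow, but the exact calls are assumptions:

```r
# Presumed refits of pha.2.p3b with alternative link functions
# (assumed, not shown in the original text); requires glmmTMB.
library(glmmTMB)

pha.2.p3b.log <- glmmTMB(IMMUNE_RESPONSE ~ 1 + REGION * poly(TEMPERATURE, 3) + (1|POPULATION/FISH_ID), 
                         family = gaussian(link = "log"),
                         data = pha2,
                         REML = TRUE)

pha.2.p3b.inv <- glmmTMB(IMMUNE_RESPONSE ~ 1 + REGION * poly(TEMPERATURE, 3) + (1|POPULATION/FISH_ID), 
                         family = gaussian(link = "inverse"),
                         data = pha2,
                         REML = TRUE)
```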

Model re-validation

performance

Gaussian (identity)

Gaussian (log)

Gaussian (inverse)

DHARMa

Gaussian (identity)
pha.2.p3b %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.34 0.104 0.84 0.396 0.228 0.192 0.496 0.032 0.284 0.428 0.408 0.536 0.464 0.652 0.544 0.188 0.172 0.536 0.44 0.748 ...
pha.2.p3b %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.096377, p-value = 0.1043
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 1.0064, p-value = 0.912
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 2, observations = 159, p-value = 0.3618
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.001526975 0.044697834
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                             0.01257862
Gaussian (log)
pha.2.p3b.log %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.34 0.096 0.856 0.384 0.28 0.172 0.564 0.024 0.324 0.448 0.376 0.568 0.524 0.692 0.632 0.188 0.18 0.556 0.532 0.82 ...
pha.2.p3b.log %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.070591, p-value = 0.4065
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 1.0116, p-value = 0.88
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 1, observations = 159, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.0001592188 0.0345421401
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                            0.006289308
Gaussian (inverse)
pha.2.p3b.inv %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.352 0.128 0.9 0.372 0.236 0.212 0.596 0.032 0.312 0.424 0.42 0.596 0.54 0.688 0.636 0.208 0.196 0.548 0.484 0.84 ...
pha.2.p3b.inv %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.080302, p-value = 0.2568
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.099362, p-value = 0.872
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 0, observations = 159, p-value = 0.6421
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.00000000 0.02293344
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                                      0

Adding log or inverse link functions to the model does not help; in fact, it seems to make the model worse! From here we can experiment with different distributions. The first distribution that comes to mind is the Gamma distribution, as it can be helpful for skewed data when the data set contains no zeros and only positive values.
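Before refitting, it is worth confirming that the response actually meets the Gamma family’s requirements: strictly positive values and right skew. A minimal base-R sketch, using simulated data as a stand-in for IMMUNE_RESPONSE:

```r
# Hedged sketch: checks that a response is a candidate for the Gamma family.
# `y` is simulated stand-in data, not the real IMMUNE_RESPONSE values.
set.seed(1)
y <- rgamma(159, shape = 2, rate = 8)  # positive, right-skewed toy data

all(y > 0)  # Gamma requires strictly positive values (no zeros)

# moment-based skewness; values > 0 indicate a right (positive) skew
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3
skewness(y)
```

Running the same two checks on the real response would justify (or rule out) a Gamma refit.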

Fit model - alternative distributions

pha.2.p3b <- glmmTMB(IMMUNE_RESPONSE ~ 1 + REGION * poly(TEMPERATURE, 3) + (1|POPULATION/FISH_ID), 
                  family=gaussian(),
                  data = pha2,
                  REML = FALSE) 

pha.2.p3b.gamma <- glmmTMB(IMMUNE_RESPONSE~ 1 + REGION* poly(TEMPERATURE, 3) + (1|POPULATION/FISH_ID), 
                       family=Gamma(link="log"), # log link (the default link for Gamma is inverse)
                       data = pha2, 
                       REML = FALSE)
model df AICc BIC r2m r2c
pha.2.p3b 11 -32.7787 -0.8166702 0.2153506 0.2153506
pha.2.p3b.gamma 11 -136.0719 -104.1099081 0.2635317 0.2635317

From this model comparison we can see that the model fitted with the Gamma distribution performs much better than the model fitted with the Gaussian distribution. Let’s look at the model validation plots for our Gamma model.

Model re-re-validation

performance

Gamma distribution

Looks better

DHARMa

Gamma distribution
pha.2.p3b.gamma %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.288 0.176 0.888 0.212 0.16 0.3 0.664 0.008 0.388 0.504 0.6 0.676 0.6 0.888 0.656 0.368 0.012 0.736 0.552 0.86 ...
pha.2.p3b.gamma %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.059044, p-value = 0.6364
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.7999, p-value = 0.4
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 0, observations = 159, p-value = 0.6421
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.00000000 0.02293344
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                                      0

Looks much better!

The Gamma distribution does a decent job of modelling our data, so we can move forward with it and start to investigate the model.

Partial plots

ggemmeans

plot_model

Model investigation

summary

Estimate StdError Zvalue Pvalue
(Intercept) -1.3847661 0.0907111 -15.2656703 0.0000000
REGIONLeading -0.1636031 0.1348218 -1.2134770 0.2249475
poly(TEMPERATURE, 3)1 -4.8549609 1.1206614 -4.3322281 0.0000148
poly(TEMPERATURE, 3)2 -2.6825503 1.1140906 -2.4078385 0.0160473
poly(TEMPERATURE, 3)3 1.1279198 1.1107719 1.0154378 0.3098972
REGIONLeading:poly(TEMPERATURE, 3)1 1.3686326 1.6384364 0.8353285 0.4035328
REGIONLeading:poly(TEMPERATURE, 3)2 -2.4701223 1.6562860 -1.4913622 0.1358664
REGIONLeading:poly(TEMPERATURE, 3)3 0.3600876 1.6536806 0.2177492 0.8276245

Anova

Chisq Df Pr(>Chisq)
REGION 1.421053 1 0.2332302
poly(TEMPERATURE, 3) 50.414204 3 0.0000000
REGION:poly(TEMPERATURE, 3) 2.933305 3 0.4020230

confint

2.5 % 97.5 % Estimate
(Intercept) -1.5625566 -1.2069755 -1.3847661
REGIONLeading -0.4278489 0.1006427 -0.1636031
poly(TEMPERATURE, 3)1 -7.0514170 -2.6585049 -4.8549609
poly(TEMPERATURE, 3)2 -4.8661279 -0.4989728 -2.6825503
poly(TEMPERATURE, 3)3 -1.0491531 3.3049927 1.1279198
REGIONLeading:poly(TEMPERATURE, 3)1 -1.8426438 4.5799090 1.3686326
REGIONLeading:poly(TEMPERATURE, 3)2 -5.7163832 0.7761385 -2.4701223
REGIONLeading:poly(TEMPERATURE, 3)3 -2.8810668 3.6012420 0.3600876
Std.Dev.(Intercept)|FISH_ID:POPULATION 0.0000000 Inf 0.0000462
Std.Dev.(Intercept)|POPULATION 0.0000024 691.9122734 0.0410140

r-squared

R2_conditional R2_marginal optional
0.265392 0.2635317 FALSE

Note that the random effects within this model explain very little variance and are largely non-informative.
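As a quick worked check, the variance attributable to the random effects is the gap between the conditional and marginal R-squared values reported above:

```r
# Worked example from the values reported above: the random effects
# account for only the gap between conditional and marginal R-squared.
r2_conditional <- 0.265392   # fixed + random effects
r2_marginal    <- 0.2635317  # fixed effects only
r2_conditional - r2_marginal # ~0.0019, i.e. < 0.2% of the variance
```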

Pairwise comparisons

emtrends [latitudes]

pha.2.p3b.gamma %>% emtrends(var = "TEMPERATURE", type = "response") %>% pairs(by = "TEMPERATURE") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)

SCROLL TO THE RIGHT –>

The numbers in the leftmost column of the table indicate that the slopes assume mean MASS_CENTERED and RESTING_TIME_SECONDS values when looking at differences between latitudinal slopes.

emmeans [latitudes]

pha.2.p3b.gamma %>% emmeans(pairwise ~ TEMPERATURE*REGION, type = "response") %>% pairs(by = "TEMPERATURE") %>% summary(by = NULL, adjust = "tukey", infer=TRUE)

temperature

pha.2.p3b.gamma %>% emmeans(~ TEMPERATURE*REGION, type = "response")  %>% summary(infer=TRUE)

Means - f(temperature)

pha.2.p3b.gamma %>% update(.~ 1 + REGION* as.factor(TEMPERATURE) + (1|POPULATION/FISH_ID)) %>% 
  emmeans(~REGION*TEMPERATURE, type = "response") %>% summary(infer=TRUE)

Abs. diff - f(temperature)

pha.2.p3b.gamma %>% update(.~ 1 + REGION* as.factor(TEMPERATURE) + (1|POPULATION/FISH_ID)) %>% 
  emmeans(~REGION*TEMPERATURE, type = "response") %>% pairs(by ="REGION") %>% summary(infer=TRUE)

effect size [latitudes]

immuno.emm <- pha.2.p3b.gamma %>% emmeans(~REGION*TEMPERATURE)
eff_size(immuno.emm, sigma = sigma(pha.2.p3b.gamma), edf=df.residual(pha.2.p3b.gamma))
##  contrast                                                              
##  Core TEMPERATURE28.9339622641509 - Leading TEMPERATURE28.9339622641509
##  effect.size    SE  df asymp.LCL asymp.UCL
##       -0.105 0.265 Inf    -0.626     0.415
## 
## sigma used for effect sizes: 0.815 
## Confidence level used: 0.95

Summary figure

Conclusion

  • In conclusion, while immunocompetence is significantly related to temperature, there is no significant difference in immunocompetence between fish from the low- and high-latitude regions.

Hematocrit

Scenario

For initial details on the experiment performed please read the ReadMe file. In brief, Acanthochromis polyacanthus from two different regions on the Great Barrier Reef (GBR) were tested for metabolic performance at four different temperatures: 27\(^\circ\)C, 28.5\(^\circ\)C, 30\(^\circ\)C, and 31.5\(^\circ\)C. Fish used in this study were collected from two different regions, low-latitude (i.e., Cairns) and high-latitude (i.e., Mackay); within each region fish were collected from three different populations.

Blood samples for hematocrit measurements were collected two weeks after fish underwent respirometry testing at the final experimental temperature (31.5\(^\circ\)C). Hematocrit ratios were measured by comparing the amount of packed red blood cells to blood plasma after blood samples were collected via capillary tubes.
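The hematocrit calculation itself is simple. A minimal sketch, assuming PERC_RBC is the packed red blood cell column height divided by the total blood column height in the capillary tube (the measurement values and helper name here are illustrative, not taken from the data):

```r
# Illustrative hematocrit helper: packed red blood cell column height
# as a fraction of the total blood column in the capillary tube.
# Inputs and function name are hypothetical, not from the data set.
hematocrit <- function(rbc_mm, total_mm) rbc_mm / total_mm

hematocrit(rbc_mm = 11.2, total_mm = 50)  # 0.224
```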

Read in the data

Before beginning always make sure that you are working in the correct directory

knitr::opts_knit$set(root.dir=working.dir)

Now we can import the data. Replace import data with the PATH to your data file. I have secretly labelled my PATH import.data (i.e., import.data = “PATH TO MY FILE”)

Load data

hema <- import.data

Data manipulation

Before the data can be analysed it is important to clean up the data file. Below, a number of adjustments are made, primarily making sure that columns are treated appropriately as either factors or numeric, as well as removing rows with missing values.

hema <-  hema %>% 
  mutate(PERC_RBC = as.numeric(PERC_RBC), 
         MASS = as.numeric(MASS),
         MASS_CENTERED = scale(MASS, center = TRUE, scale = FALSE)) %>% 
  drop_na(PERC_RBC)

Great! That is everything for data manipulation

Exploratory data analysis

HEMATOCRIT V LATITUDE

ggplot(hema, aes(y=PERC_RBC, x=REGION)) + 
  geom_boxplot() + 
  theme_classic() 

HEMATOCRIT V LATITUDE (distr)

hema %>% ggplot(aes(x=PERC_RBC)) + 
  geom_density() +
  facet_wrap(~REGION)

Fit the model

The model was fit using the glm function and later the glmmTMB package in R. A number of different models were tested to determine which hypothesis and associated variables best predicted hematocrit ratios. Model fit was examined using AICc, BIC, and r-squared values. Additional models were examined via the validation diagnostics provided by the performance and DHARMa packages in R.

Fixed factors (linear regression models)

model 1
#--- base model ---#
hema.1 <- glm(PERC_RBC ~ REGION, 
                family = gaussian(),  
                data = hema) 
summary
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.2236353 0.0121433 18.41628 0.0000000
REGIONLeading 0.0355744 0.0181554 1.95944 0.0578398
model 2
#--- model including MASS_CENTERED ---#
hema.2 <- glm(PERC_RBC ~ REGION + MASS_CENTERED, 
                 family = gaussian(), 
                 data = hema)  
summary
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.2251300 0.0137982 16.3159093 0.0000000
REGIONLeading 0.0322334 0.0230904 1.3959651 0.1715176
MASS_CENTERED -0.0003688 0.0015402 -0.2394485 0.8121545
model comparison table
model df AICc BIC r2
hema.1 3 -107.0515 -102.84462 0.0940123
hema.2 4 -104.6075 -99.26924 0.0930529

There is little difference between the two initial models; however, the model that does not include MASS_CENTERED performs better according to the model comparison scores. Therefore, we will move forward with the first, and simplest, model.

Random factors

Fish were only sampled once, therefore, there is no need to include individual as a random factor. However, we will test how the inclusion of POPULATION influences the model.

random factor models
hema.1 <- glmmTMB(PERC_RBC ~ REGION, 
                   family = gaussian(), 
                   REML = TRUE, 
                   data = hema) 

hema.1b <- glmmTMB(PERC_RBC ~ REGION + (1|POPULATION), 
                    family = gaussian(), 
                    REML = TRUE, 
                    data = hema) 

hema.1c <- glmmTMB(PERC_RBC ~ REGION + (REGION|POPULATION), 
                    family = gaussian(), 
                    REML = TRUE, 
                    data = hema) # convergence problem
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; singular convergence (7). See vignette('troubleshooting'),
## help('diagnose')
random factor model comparisons
model df AICc BIC r2m r2c
hema.1 3 -93.24011 -89.03324 0.0940123 0.0940123
hema.1b 4 -90.73388 -85.39565 0.0940123 0.0940123
hema.1c 6 -85.58393 -78.46809 0.0958933 0.1526837

The inclusion of random effects does not help explain additional variation, and therefore random effects will not be included in the model. Note that the final model will be run using glm and not glmmTMB, because we are not using a mixed model.

Model validation

performance

hema.1

DHARMa residuals

hema.1
hema.1 %>% simulateResiduals(plot=TRUE)

## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help. 
##  
## Scaled residual values: 0.892 0.956 0.616 0.744 0.06 0.548 0.1 0.06 0.108 0.148 0.188 0.06 0.628 0.392 0.396 0.06 0.912 0.968 0.84 0.276 ...
hema.1 %>% DHARMa::testResiduals(plot=TRUE)

## $uniformity
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.084842, p-value = 0.9473
## alternative hypothesis: two-sided
## 
## 
## $dispersion
## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 0.97134, p-value = 0.952
## alternative hypothesis: two.sided
## 
## 
## $outliers
## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  simulationOutput
## outliers at both margin(s) = 0, observations = 38, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
##  0.00000000 0.09251276
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 ) 
##                                                      0

The basic model looks good and passes the validation checks.

Partial plots

ggemmeans

plot_model

Model investigation

summary

Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.2236353 0.0121433 18.41628 0.0000000
REGIONLeading 0.0355744 0.0181554 1.95944 0.0578398

Anova

LR Chisq Df Pr(>Chisq)
REGION 3.839405 1 0.0500613

confint

## Waiting for profiling to be done...
2.5 % 97.5 %
(Intercept) 0.1998348 0.2474358
REGIONLeading -0.0000095 0.0711583

r-squared

R2
R2 0.0963721

Pairwise comparisons

emmeans [latitudes]

hema.1 %>% emmeans(pairwise ~ REGION, type = "response")
## $emmeans
##  REGION  emmean     SE df lower.CL upper.CL
##  Core     0.224 0.0121 36    0.199    0.248
##  Leading  0.259 0.0135 36    0.232    0.287
## 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast       estimate     SE df t.ratio p.value
##  Core - Leading  -0.0356 0.0182 36  -1.959  0.0578

effect size [latitudes]

hema.emm <- emmeans(hema.1, "REGION")
eff_size(hema.emm, sigma = sigma(hema.1), edf=df.residual(hema.1))
##  contrast       effect.size    SE df lower.CL upper.CL
##  Core - Leading      -0.639 0.335 36    -1.32   0.0398
## 
## sigma used for effect sizes: 0.05565 
## Confidence level used: 0.95

Summary figure

Conclusion

  • In conclusion there is no significant difference in hematocrit ratios between A. polyacanthus from low- and high-latitude at 31.5\(^\circ\)C.

General summary

MAXIMUM METABOLIC RATE and ABSOLUTE AEROBIC SCOPE showed significant differences in the temperature response curve between fish collected from the low-latitude region of the Great Barrier Reef and fish from the sampled high-latitude (Mackay) region. However, no significant difference was seen in RESTING METABOLIC RATE. All oxygen consumption metrics showed significant positive relationships with temperature.

Similarly, the two enzymes that were analysed in this study showed significant positive relationships with temperature, but no significant differences between low- and high-latitude regions. Immunocompetence displayed a hill-shaped relationship with temperature, represented via a third-order polynomial model; it, too, was significantly related to temperature, but once again no significant differences were observed between low- and high-latitude regions.

No significant differences were observed between low- and high-latitude regions with respect to hematocrit ratios; however, the difference between regions was marginal (p-value = 0.058) and deserves further investigation in future research.

Further analysis of the results can be seen within the paper titled “[INSERT TITLE HERE]”, doi: XXXX

Figures

Figure for manuscript

Figure 2

rmr.g2 <- ggplot(rmr.emm.df, aes(y=emmean, x=TEMPERATURE, color=REGION, linetype=REGION))+
  geom_jitter(data=rmr.obs, aes(y=Fit, color=REGION), width=0.05, alpha = 0.3) +
  stat_smooth(method = "lm", 
              formula =y ~ poly(x, 2, raw=TRUE)) + 
  geom_ribbon(aes(x=TEMPERATURE, ymin= lower.CL, ymax= upper.CL, fill = REGION), 
              alpha = 0.2, color=NA)+ 
  scale_x_continuous(limits = c(26.9, 31.6), breaks = seq(27, 31.5, by = 1.5))+ 
  scale_y_continuous(limits = c(2,12), breaks = seq(2, 12, by = 2)) +
  theme_classic() + ylab(expression("RESTING METABOLIC RATE (MgO "[2]* " hr"^{-1} * ")")) + xlab("")+
  scale_linetype_manual(values = c("solid", "dashed"), labels = c("Low-latitude","High-latitude")) +
  scale_color_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) +
  scale_fill_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) + 
  theme(legend.position = "top", 
        legend.text = element_text(size = 10), 
        legend.title = element_blank(), 
        axis.title = element_text(size =12), 
        axis.text = element_text(size=10))  
  #annotate("text", x=30, y= 11.5, label="P =0.51", fontface="italic", size=5)

mmr.g2 <- ggplot(mmr.emm.df, aes(y=emmean, x=TEMPERATURE, color=REGION, linetype=REGION))+
  geom_jitter(data=mmr.obs, aes(y=Fit, color=REGION), width=0.05, alpha = 0.3) + 
  stat_smooth(method = "lm", 
              formula =y ~ poly(x, 2, raw=TRUE)) + 
  geom_ribbon(aes(x=TEMPERATURE, ymin= lower.CL, ymax= upper.CL, fill = REGION), 
              alpha = 0.2, color=NA) +
  scale_x_continuous(limits = c(26.9, 31.6), breaks = seq(27, 31.5, by = 1.5))+ 
  scale_y_continuous(limits = c(6,28), breaks = seq(6, 28, by = 2)) +
  theme_classic() + ylab(expression("MAXIMUM OXYGEN CONSUMPTION (MgO   " [2]* "  hr"^{-1} * ")"))  + xlab("")+
  scale_linetype_manual(values = c("solid", "dashed"), labels = c("Low-latitude","High-latitude")) +
  scale_color_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) + 
  scale_fill_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) + 
  theme(legend.position = 'none', 
        legend.text = element_text(size = 10), 
        legend.title = element_blank(), 
        axis.title = element_text(size =12), 
        axis.text = element_text(size=10))
  #annotate("text", x=30, y= 27, label="P =0.0010", fontface="italic", size=5) 

nas.g2 <- ggplot(nas.emm.df, aes(y=emmean, x=TEMPERATURE, color=REGION, linetype=REGION)) + 
  geom_jitter(data=nas.obs, aes(y=Fit, color=REGION), width=0.05, alpha = 0.3) +
  stat_smooth(method = "lm", 
              formula =y ~ poly(x, 2, raw=TRUE)) + 
  geom_ribbon(aes(x=TEMPERATURE, ymin= lower.CL, ymax= upper.CL, fill = REGION), 
              alpha = 0.2, color=NA)+
  scale_y_continuous(limits = c(4,20), breaks = seq(4, 20, by = 2)) + 
  scale_x_continuous(limits = c(26.9, 31.6), breaks = seq(27, 31.5, by = 1.5))+
  theme_classic() + ylab(expression("ABSOLUTE AEROBIC SCOPE (MgO "[2]* " hr"^{-1} * ")")) +
  scale_linetype_manual(values = c("solid", "dashed"), labels = c("Low-latitude","High-latitude")) +
  scale_color_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) +
  scale_fill_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) + 
  theme(legend.position = "none", 
        legend.text = element_text(size = 10), 
        legend.title = element_blank(), 
        axis.title = element_text(size =12), 
        axis.text = element_text(size=10))  
  #annotate("text", x=30, y= 19, label="P =0.0010", fontface="italic", size=5)
fig2 <- ggarrange(rmr.g2, mmr.g2, nas.g2, 
          nrow = 3, 
          ncol=1, 
          align = "v",
          labels = c("A","B","C"),
          common.legend = TRUE); fig2
## Warning: Removed 4 rows containing missing values (`geom_point()`).
Fig. 2: Thermal performance curves of resting metabolic rate (A), maximum oxygen consumption (B), and absolute aerobic scope (C) of fish from low- (solid red lines) and high-latitude (dashed blue lines) regions across four experimental temperatures. Ribbons represent 95% confidence intervals.
ggsave("C:/Users/jc527762/OneDrive - James Cook University/PhD dissertation/Data/Chapter1_LocalAdaptation/figures/Figure2.pdf", fig2, device="pdf", width=6.6, height = 19.86, units = "in", dpi=1200)

Figure 3

pha.emm <- emmeans(pha.2.p3b.gamma, ~ TEMPERATURE*REGION, 
               at = list(TEMPERATURE = seq(from=27, to = 31.5, by=.1)), 
               type='response')
pha.emm.df=as.data.frame(pha.emm)

pha.obs <-  pha2 %>% 
  mutate(Pred=predict(pha.2.p3b.gamma, re.form=NA),
         Resid = residuals(pha.2.p3b.gamma, type='response'),
         Fit = Pred + Resid)
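As in the panels above, the `Fit` column is a partial-residual reconstruction of the observations: response-scale residuals are defined as observed minus fitted, so adding them back to the population-level predictions recovers the raw values. A minimal sketch of that identity with a toy `lm` fit (the variable names here are illustrative, not from the dataset):

```r
# Toy data: residuals(type = "response") equal observed minus fitted,
# so fitted + residual reconstructs the original observations exactly.
set.seed(1)
toy <- data.frame(x = 1:10)
toy$y <- 2 * toy$x + rnorm(10)

m <- lm(y ~ x, data = toy)
fit_reconstructed <- predict(m) + residuals(m, type = "response")

# Reconstruction matches the raw response up to floating-point error
all.equal(unname(fit_reconstructed), toy$y)
```

For the mixed models above, `re.form = NA` drops the random effects from the predictions, so `Pred + Resid` shows the observations relative to the fixed-effect (population-level) fit.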

pha.g2 <- ggplot(pha.emm.df, aes(y=response, x=TEMPERATURE, color = REGION, linetype=REGION)) + 
  stat_smooth(method = "lm", se=FALSE,
              formula =y ~ poly(x, 3, raw=TRUE)) +  
  geom_ribbon(aes(x=TEMPERATURE, ymin= asymp.LCL, ymax= asymp.UCL, fill = REGION), 
              alpha = 0.2, color=NA) + 
  #scale_y_continuous(limits = c(0,0.9), breaks = seq(0, 0.9, by =0.15)) + 
  theme_classic() + ylab("PHA RESPONSE (mm)") + 
  scale_linetype_manual(values = c("solid", "dashed"), labels = c("Low-latitude","High-latitude")) +
  scale_color_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) + 
  scale_fill_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) +
  theme(legend.position = c(0.855,0.8), 
        legend.text = element_text(size = 10), 
        legend.title = element_blank(), 
        axis.title = element_text(size =12), 
        axis.text = element_text(size=10))  
  #annotate("text", x=30, y=0.495, fontface="italic", size=5, label="P =0.85")
pha.g2
Fig. 3: Thermal performance curve of the swelling response of the caudal peduncle ~18-24 hours post-injection of phytohemagglutinin across four experimental temperatures. Solid red lines represent low-latitude populations; dashed blue lines represent high-latitude populations. Ribbons represent 95% confidence intervals.
ggsave("C:/Users/jc527762/OneDrive - James Cook University/PhD dissertation/Data/Chapter1_LocalAdaptation/figures/Figure3.pdf", pha.g2, device="pdf", width=6.6, height = 5, units = "in", dpi=1200)

Figure 4

#---ldh---#
ldh.emm <- emmeans(ldh.model.1.p2a, ~ temperature*REGION, 
                   at = list(temperature = seq(from=20, to = 50, by=1)))
ldh.emm.df=as.data.frame(ldh.emm)

ldh.obs <- ldh.data %>% 
  mutate(Pred = predict(ldh.model.1.p2a, re.form=NA), 
         Resid = residuals(ldh.model.1.p2a, type = 'response'), 
         Fit = Pred + Resid)

cldh2 <- ggplot(ldh.emm.df, aes(y=emmean, x=temperature, color=REGION, fill=REGION, linetype=REGION)) + 
  stat_smooth(method = "lm", se=FALSE, 
              formula =y ~ poly(x, 3, raw=TRUE)) + 
  geom_ribbon(aes(x=temperature, ymin= lower.CL, ymax= upper.CL, fill = REGION), 
              alpha = 0.2, color=NA) +
  geom_jitter(data=ldh.obs, aes(y=Fit, color=REGION), width=0.05, alpha = 0.3) +
  scale_x_continuous(limits = c(19,51), breaks = seq(20, 50, by =10)) +
  scale_y_continuous(limits = c(0,300), breaks = seq(0, 300, by =50)) + 
  theme_classic() + ylab(expression("LDH ACTIVITY (U mg "^{-1}*" tissue)")) + xlab("TEMPERATURE") +
  scale_linetype_manual(values = c("solid", "dashed"), labels = c("Low-latitude","High-latitude")) +
  scale_color_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) +
  scale_fill_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude"))+
  theme(legend.position = 'none', 
        legend.title = element_blank())

#--- cs ---#
cs.emm <- emmeans(cs.model.1a.log.p2, ~ TEMPERATURE*REGION, type='response',
                   at = list(TEMPERATURE = seq(from=20, to = 50, by=1)), 
                  )
cs.emm.df=as.data.frame(cs.emm)

cs.obs <- CS.data %>% 
  mutate(Pred = predict(cs.model.1a.log.p2, re.form=NA, type= 'response'), 
         Resid = residuals(cs.model.1a.log.p2, type = 'response'), 
         Fit = Pred + Resid)

cs.plot2 <- ggplot(cs.emm.df, aes(y=response, x=TEMPERATURE, color=REGION, fill=REGION, linetype=REGION)) + 
  stat_smooth(method = "lm", se=FALSE, 
              formula =y ~ poly(x, 2, raw=TRUE)) +  
  geom_jitter(data=cs.obs, aes(y=Fit, color=REGION), width=0.05, alpha = 0.3) +
  geom_ribbon(aes(x=TEMPERATURE, ymin= lower.CL, ymax= upper.CL, fill = REGION), 
              alpha = 0.2, color=NA) +
  scale_y_continuous(limits = c(0,10), breaks = seq(0, 10, by =2)) + 
  theme_classic() + ylab(expression("CS ACTIVITY (U mg "^{-1}*" tissue)")) + xlab("TEMPERATURE") +
  scale_linetype_manual(values = c("solid", "dashed"), labels = c("Low-latitude","High-latitude")) +
  scale_color_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) +
  scale_fill_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude"))+
  theme(legend.position = 'none', 
        legend.text = element_text(size = 10), 
        legend.title = element_blank(), 
        axis.title = element_text(size =12), 
        axis.text = element_text(size=10))
  #annotate("text", x=40, y=9.8, label="p =0.15", fontface = 'italic', size = 5)

#--- ldh:cs ratio ---#
ldh.cs.emm <- emmeans(ldh.cs.model.1a, ~ temperature*REGION, type='response',
                   at = list(temperature = seq(from=20, to = 50, by=1)), 
                  )
ldh.cs.emm.df=as.data.frame(ldh.cs.emm)

ldh.cs.obs <- ldh.cs.data %>% 
  mutate(Pred = predict(ldh.cs.model.1a, re.form=NA, type= 'response'), 
         Resid = residuals(ldh.cs.model.1a, type = 'response'), 
         Fit = Pred + Resid)

ldh.cs.plot2 <- ggplot(ldh.cs.emm.df, aes(y=emmean, x=temperature, color=REGION, fill=REGION, linetype=REGION)) + 
  stat_smooth(method = "lm", se=FALSE, 
              formula =y ~ poly(x, 2, raw=TRUE)) +  
  geom_jitter(data=ldh.cs.obs, aes(y=Fit, color=REGION), width=0.05, alpha = 0.3) +
  geom_ribbon(aes(x=temperature, ymin= lower.CL, ymax= upper.CL, fill = REGION), 
              alpha = 0.2, color=NA) +
  scale_y_continuous(limits = c(0,50), breaks = seq(0, 50, by =10)) + 
  theme_classic() + ylab("LDH:CS RATIO") + xlab("TEMPERATURE") +
  scale_linetype_manual(values = c("solid", "dashed"), labels = c("Low-latitude","High-latitude")) +
  scale_color_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) +
  scale_fill_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude"))+
  theme(legend.position = 'none', 
        legend.text = element_text(size = 10), 
        legend.title = element_blank(), 
        axis.title = element_text(size =12), 
        axis.text = element_text(size=10))
  #annotate("text", x=40, y=48, label="p =0.91", fontface = 'italic', size = 5)
fig4 <- ggarrange(cldh2, cs.plot2, ldh.cs.plot2, 
          nrow = 3, 
          ncol=1, 
          align = "v",
          labels = c("A","B","C"),
          common.legend = TRUE)

fig4
Fig. 4: Thermal performance curves of maximal activity of A) lactate dehydrogenase (LDH), B) citrate synthase (CS), and C) the LDH:CS ratio of low- (solid red lines) and high-latitude (dashed blue lines) populations across four experimental temperatures (i.e., 20°C, 30°C, 40°C, 50°C). Ribbons represent 95% confidence intervals.
ggsave("C:/Users/jc527762/OneDrive - James Cook University/PhD dissertation/Data/Chapter1_LocalAdaptation/figures/Figure4.pdf", fig4, device="pdf", width=6.6, height = 19.86, units = "in", dpi=1200)

Supplemental figures for manuscript

Supplemental figure 4

hema.newdata <-  hema.1 %>% ggemmeans(~REGION) %>% 
  as.data.frame() %>% 
  dplyr::rename(REGION = x)

obs <- hema %>% 
  mutate(Pred = predict(hema.1, re.form=NA), 
         Resid = residuals(hema.1, type = "response"), 
         Fit = Pred + Resid)

hema.plot <- ggplot(hema.newdata, aes(y=predicted, x=REGION, color=REGION, linetype=REGION))  + 
  geom_jitter(data=obs, aes(y=Fit, x=REGION, color=REGION), 
              width = 0.05, alpha=0.3)+
  geom_pointrange(aes(ymin=conf.low, 
                      ymax=conf.high), 
                  shape = 19, 
                  size = 1, 
                  position = position_dodge(0.2)) + 
  scale_linetype_manual(values = c("solid", "dashed"), labels = c("Low-latitude","High-latitude")) +
  scale_color_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) +
  ylab("HEMATOCRIT RATIO") +
  scale_x_discrete(name = "", 
                   labels = c("Low-latitude","High-latitude"))+
  theme_classic() + 
  theme(legend.position = 'top', 
        legend.text = element_text(size = 10), 
        legend.title = element_blank(), 
        axis.title = element_text(size =12), 
        axis.text = element_text(size=10))   
  #annotate("text", x=1.5, y=0.275, fontface="italic", size=5, label="P =0.057")
hema.plot
Supplemental Fig. 4: Comparison of hematocrit ratios, measured at 31.5°C, between low- (red) and high-latitude (blue) populations. No significant difference was observed between regions (p = 0.058). Solid (low-latitude) and dashed (high-latitude) lines represent 95% confidence intervals.
ggsave("C:/Users/jc527762/OneDrive - James Cook University/PhD dissertation/Data/Chapter1_LocalAdaptation/supplemental_figures/Supplemental_figure4.pdf", hema.plot, device="pdf", width=6.6, height = 5, units = "in", dpi=1200)

Supplemental figure 3

library(ggridges)
mass.distr <- resp4 %>% distinct(FISH_ID, .keep_all = TRUE) %>% 
  mutate(CHAMBER_RATIO = 1.5/DRY_WEIGHT) %>%
  ggplot(aes(x=CHAMBER_RATIO, y=REGION, fill=REGION)) + 
  scale_fill_manual(values=c("#B2182B", "#4393C3"), labels = c("Low-latitude","High-latitude")) +
  ylab("")+ scale_x_continuous(limits = c(20,150), breaks = seq(20, 150, by =20)) + 
  geom_density_ridges(scale = 2, jittered_points=TRUE, position = position_points_jitter(height = 0),
                      point_shape = '|', point_size = 3, point_alpha = 1, alpha = 0.7) + 
  theme_classic() + 
  theme(axis.title.y=element_blank(),
        axis.text.y=element_blank(), 
        axis.ticks.y=element_blank())
mass.distr
## Picking joint bandwidth of 5.99
Supplemental Fig. 3: Density plots displaying fish body-size to chamber ratios. Fish sampled for aerobic physiology from the low-latitude region are represented in red; fish from the high-latitude region are represented in blue.
ggsave("C:/Users/jc527762/OneDrive - James Cook University/PhD dissertation/Data/Chapter1_LocalAdaptation/supplemental_figures/Supplemental_figure3.pdf",mass.distr, device="pdf", width=6.6, height = 5, units = "in", dpi=1200)
## Picking joint bandwidth of 5.99